Abstract

In this paper, we consider lower bounds on the query complexity for testing CSPs in the
bounded-degree model.

First, for any "symmetric" predicate $P : \{0,1\}^k \to \{0,1\}$ except EQU where $k \ge 3$, we
show that every (randomized) algorithm that distinguishes satisfiable instances of CSP(P) from
instances $(|P^{-1}(0)|/2^k - \epsilon)$-far from satisfiability requires $\Omega(n^{1/2+\delta})$ queries, where $n$ is the
number of variables and $\delta > 0$ is a constant that depends on $P$ and $\epsilon$. This breaks a natural
lower bound $\Omega(n^{1/2})$, which is obtained by the birthday paradox. We also show that every
one-sided error tester requires $\Omega(n)$ queries for such $P$. These results are hereditary in the sense
that the same results hold for any predicate $Q$ such that $P^{-1}(1) \subseteq Q^{-1}(1)$. For EQU, we give
a one-sided error tester whose query complexity is $\tilde{O}(n^{1/2})$. Also, for 2-XOR (or, equivalently,
E2LIN2), we show an $\Omega(n^{1/2+\delta})$ lower bound for distinguishing instances $\epsilon$-close to
satisfiability from instances $(1/2 - \epsilon)$-far from satisfiability.

Next, for the general $k$-CSP over the binary domain, we show that every algorithm that
distinguishes satisfiable instances from instances $(1 - 2k/2^k - \epsilon)$-far from satisfiability requires
$\Omega(n)$ queries. The matching NP-hardness is not known, even assuming the Unique Games
Conjecture or the $d$-to-1 Conjecture. As a corollary, for MIS on graphs with $n$ vertices and a
degree bound $d$, we show that every approximation algorithm within a factor $d/\mathrm{poly}\log d$ and
an additive error of $\epsilon n$ requires $\Omega(n)$ queries. Previously, only super-constant lower bounds were
known.
1 Introduction

Property testing [13] is a relaxation of decision. We call a randomized algorithm a $(\gamma, \epsilon)$-tester
when, given oracle access $\mathcal{O}_\Phi$ to an instance $\Phi$, it accepts $\Phi$ with a probability of at least 2/3 if
$\Phi$ is $\gamma$-close to a predetermined property, and rejects $\Phi$ with a probability of at least 2/3 if $\Phi$ is
$\epsilon$-far from the property. A $(\gamma, \epsilon)$-tester is often referred to as a tolerant tester [21]. The
efficiency of an algorithm is measured by the query complexity, which is the number of accesses to
$\mathcal{O}_\Phi$. The definition of farness depends on the model. A $(0, \epsilon)$-tester is simply called an $\epsilon$-tester.

In this paper, we study testers for $k$-CSP (constraint satisfaction problems) in the bounded-degree
model and show various lower bounds on the query complexity. An instance $\Phi$ of $k$-CSP
is a tuple of a set of variables and a set of constraints (functions) over $k$ variables. We test
whether there exists an assignment to the variables that satisfies all the constraints. We only consider
Boolean CSPs. The degree of a variable $x$ is the number of constraints in which $x$ appears. In the
bounded-degree model [15], we only consider instances such that the degree of each variable is at
most $d$, where $d$ is a predetermined parameter. By specifying a variable $x$ and an index $i$ ($1 \le i \le d$),
the oracle $\mathcal{O}_\Phi$ returns the $i$-th constraint in which $x$ appears. If there exists no such constraint,
$\mathcal{O}_\Phi$ returns some unique symbol. An instance $\Phi$ is called $\epsilon$-far from satisfiability if we must remove
at least $\epsilon dn/k$ constraints to make $\Phi$ satisfiable. An instance $\Phi$ is called $\epsilon$-close to satisfiability
if we can make $\Phi$ satisfiable by removing at most $\epsilon dn/k$ constraints. Let $P : \{0,1\}^k \to \{0,1\}$
be a predicate (a function). Then, CSP(P) is a sub-problem of $k$-CSP in which every constraint
is specified by the same predicate $P$ and literals on it (see Section 2 for details). For a concrete
predicate, we often use $P$ as the name of a problem instead of writing CSP(P) (e.g., $k$-XOR).

The first contribution of this paper is the development of a new technique to show lower bounds
for testing a wide range of CSP(P). A predicate $P$ is called symmetric if the following conditions
hold: (i) $P(x) = P(y)$ for any $x, y \in \{0,1\}^k$ such that $|x| = |y|$. (ii) $P(x) = P(\bar{x})$ for any $x \in \{0,1\}^k$,
where $\bar{x} = (1, \ldots, 1) - x$. We assume $|P^{-1}(1)| > 0$ throughout this paper. The simplest symmetric
predicates might be $k$-EQU $: \{0,1\}^k \to \{0,1\}$, which is satisfied iff the variables are all zeros or
all ones, and $k$-NAE $: \{0,1\}^k \to \{0,1\}$, which is satisfied iff not all of the variables have the same
value. $k$-NAE is closely related to coloring of $k$-uniform hypergraphs. We show the next theorem.

Theorem 1.1. Let $P : \{0,1\}^k \to \{0,1\}$ be a symmetric predicate except $k$-EQU where $k \ge 3$.
Then, for any $\epsilon > 0$ and predicate $Q : \{0,1\}^k \to \{0,1\}$ such that $P^{-1}(1) \subseteq Q^{-1}(1)$, there exist
$\delta = O(1/\log(k/\epsilon^2))$ and $d = O(1/\epsilon^2)$ such that every $(|Q^{-1}(0)|/2^k - \epsilon)$-tester for CSP(Q) with a
degree bound $d$ requires $\Omega(n^{1/2+\delta})$ queries.

We note that a $(|Q^{-1}(0)|/2^k)$-tester is trivial since no instance can be $(|Q^{-1}(0)|/2^k)$-far from
satisfiability and we can always accept. Thus, Theorem 1.1 excludes the possibility of efficient
non-trivial testers. We also stress that it is impossible to get rid of the condition of symmetry since,
for a certain non-symmetric CSP called Dicut, we have a constant-time non-trivial tester using
recent results [24, 28]. The lower bound $\Omega(n^{1/2+\delta})$ is somewhat surprising since, as we will see in
Section 2.2, this lower bound implies that even if we find cycles in the instance, they do not help
at all to test the satisfiability.
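
As an illustration of the symmetry conditions (i) and (ii) and of the trivial threshold $|P^{-1}(0)|/2^k$, the following is a minimal brute-force sketch for small $k$. The function names `is_symmetric`, `trivial_threshold`, `equ`, `nae` and `xor` are hypothetical helpers introduced here for illustration; they do not appear in the paper.

```python
from itertools import product


def is_symmetric(pred, k):
    """Brute-force check of conditions (i) and (ii) for a predicate on {0,1}^k."""
    points = list(product((0, 1), repeat=k))
    # (i): the value of the predicate depends only on the Hamming weight |x|.
    value_at_weight = {}
    for x in points:
        if value_at_weight.setdefault(sum(x), pred(x)) != pred(x):
            return False
    # (ii): the value is unchanged when every bit is complemented.
    return all(pred(x) == pred(tuple(1 - b for b in x)) for x in points)


def trivial_threshold(pred, k):
    """Return |P^{-1}(0)| / 2^k, the farness level at which testing is trivial."""
    zeros = sum(1 for x in product((0, 1), repeat=k) if not pred(x))
    return zeros / 2 ** k


def equ(x):
    return int(len(set(x)) == 1)   # all zeros or all ones


def nae(x):
    return int(len(set(x)) == 2)   # not all equal


def xor(x):
    return sum(x) % 2              # parity of the inputs is 1
```

Under these definitions, 3-EQU and 3-NAE pass the check, while 3-XOR fails condition (ii) because complementing an odd number of bits flips the parity.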

$k$-XOR is a predicate of arity $k$, which is satisfied iff the parity of its variables is 1. We show a
similar lower bound for 2-XOR.

Theorem 1.2. For any $\epsilon > 0$, there exist $\delta = O(\epsilon/\log(k/\epsilon^2))$ and $d = O(1/\epsilon^2)$ such that every
$(\epsilon, 1/2 - \epsilon)$-tester for 2-XOR with a degree bound $d$ requires $\Omega(n^{1/2+\delta})$ queries.

If an e-tester always accepts satisfiable instances, it is called a one-sided error tester. Otherwise, 
it is called a two-sided error tester. We give a tight lower bound for one-sided error testers. 



Table 1: Summary of results on query complexity of two-sided error $(\gamma, \epsilon)$-testers for various
problems. Here, $P$ denotes any symmetric predicate with arity $k \ge 3$ except $k$-EQU.

Theorem 1.3. Let $P : \{0,1\}^k \to \{0,1\}$ be a symmetric predicate except $k$-EQU where $k \ge 3$. Then,
for any $\epsilon > 0$ and any $Q : \{0,1\}^k \to \{0,1\}$ such that $P^{-1}(1) \subseteq Q^{-1}(1)$, there exists $d = O(1/\epsilon^2)$
such that every one-sided error $(|Q^{-1}(0)|/2^k - \epsilon)$-tester for CSP(Q) with a degree bound $d$ requires
$\Omega(n)$ queries.

On the other hand, $k$-EQU is an easier problem, as stated in the next theorem.

Theorem 1.4. For any $\epsilon > 0$, $d \ge 1$ and $k \ge 2$, there exists a one-sided error $\epsilon$-tester for $k$-EQU
with query complexity $O(n^{1/2}\,\mathrm{poly}(dk \log n/\epsilon))$.

Bipartiteness is the property of a graph that its vertex set can be partitioned into two
disjoint sets $U$ and $V$ such that every edge connects a vertex of $U$ and a vertex of $V$. Theorem 1.4
is almost tight since testing bipartiteness is a sub-problem of 2-EQU and an $\Omega(\sqrt{n})$ lower bound
is known for this problem [15].

The second contribution of this work is a linear lower bound for distinguishing satisfiable instances
of the general $k$-CSP from instances much further from satisfiability.

Theorem 1.5. For any $\epsilon > 0$ and $k \ge 3$, there exists $d = O(1/\epsilon^2)$ such that every
$(1 - 2k/2^k - \epsilon)$-tester for $k$-CSP with a degree bound $d$ requires $\Omega(n)$ queries.

As a corollary, we show a linear lower bound for approximating Maximum Independent Set
(MIS). An independent set of a graph is a vertex set in which no two vertices are
adjacent. MIS is the problem of finding the largest independent set in a graph. A value $x$ is called
an $(\alpha, \beta)$-approximation for a value $x^*$ if $x^* \le x \le \alpha x^* + \beta$. We call a randomized algorithm an
$(\alpha, \beta)$-approximation algorithm for MIS if, given oracle access $\mathcal{O}_G$ to a graph $G$, it computes an
$(\alpha, \beta)$-approximation for MIS with a probability of at least 2/3. Similarly to $k$-CSP, by specifying
a vertex $v$ and an index $i$ ($1 \le i \le d$), the oracle $\mathcal{O}_G$ returns the $i$-th edge in which $v$ appears. We
show the next theorem.

Theorem 1.6. Every $(d/\mathrm{poly}\log d, \epsilon n)$-approximation algorithm for Maximum Independent Set on
graphs with $n$ vertices and a degree bound $d$ requires $\Omega(n)$ queries.



Related work: There have been several works on testing CSPs. The summary of known results
is shown in Table 1. Max $k$-CSP is an optimization version of $k$-CSP in which we are to maximize
the number of constraints satisfied by an assignment. Let $P$ be a predicate of arity $k$. We notice
that if there is an approximation algorithm for Max CSP(P) with a factor j5^, we have a $(\gamma, \epsilon)$-tester
for CSP(P). Thus, a lower bound for testing CSP(P) implies a lower bound for approximating Max
CSP(P). The NP-hardness of approximation within a certain factor is often shown using a reduction
from 3-XOR. Using the same reduction, for a wide range of $P$, it is shown that there exists some
$\eta > 0$ such that any $\eta$-tester requires $\Omega(n)$ queries [29] (e.g., $\eta = 1/12$ for 2-XOR [16] and $\eta = 1/16$
for 3-NAE [31]). However, for $\epsilon > \eta$, we did not have any lower bound for $\epsilon$-testers. Theorems 1.1
and 1.2 tighten this gap and also imply the new lower bound $\Omega(n^{1/2+\delta})$ for approximating Max
CSP(P) within a factor $|P^{-1}(1)|/2^k + \epsilon$.

Assuming the Unique Games Conjecture [17], it is NP-hard to distinguish between instances
of Max $k$-CSP whose optimal solutions are $1 - \epsilon$ and $(k + o(k))/2^k + \epsilon$ [U [22]. Theorem 1.5 states
a somewhat stronger fact about sublinear-time algorithms; i.e., it is hard to distinguish satisfiable
instances from instances whose optimal solutions are at most $O(k)/2^k$ with sublinear queries. No
matching NP-hardness is known, even assuming the $d$-to-1 Conjecture [17].

The concept of $(\alpha, \epsilon n)$-approximation algorithms was introduced in [10] to approximate the
minimum spanning tree of a bounded-degree graph. Since then, numerous $(\alpha, \epsilon n)$-approximation
algorithms have been developed for graph problems [H [TOJ IS [TOJ [201 ED]. For MIS, it is shown
that there exists a constant-time $(O(d \log\log d/\log d), \epsilon n)$-approximation algorithm, and every
$(o(d/\log d), \epsilon n)$-approximation algorithm requires a super-constant number of queries pQ.
Theorem 1.6 improves this to a linear lower bound at the cost of a slightly weaker approximation factor.

Motivations: Among all CSPs, we could say that $k$-XOR is the one whose behavior is best
understood. It is NP-hard to distinguish between instances whose optimal solutions (in the sense
of Max $k$-XOR) have values $1 - \epsilon$ and $1/2 + \epsilon$ [16]. This fact means that the random assignment
achieves the best approximation ratio one can obtain in polynomial time. The behavior of Max
$k$-XOR under linear programming (LP) and semidefinite programming (SDP) is also well studied.
The quality of a linear or semidefinite programming relaxation is measured by its integrality gap, which is the ratio
of the optimum for the relaxation to the optimum for the original problem. The Lovász-Schrijver
hierarchies (LS, LS+), the Sherali-Adams hierarchy (SA), and the Lasserre hierarchy are sequences of relaxations
of those programs that yield tighter approximations. For all of these hierarchies, the integrality gaps
remain $2 - \epsilon$ after $\Omega(n)$ rounds of relaxations [9, 11, 23].

Presumably, the reason why Max $k$-XOR is hard to approximate is that the accepting assignments
of $k$-XOR contain the support of a $(k - 1)$-wise independent distribution. These results
are extended to predicates whose accepting assignments contain the support of a pairwise
independent distribution [UELllEB]. For other predicates, however, we can approximate better than
the random assignment using SDP (e.g., 3-NAE). One motivation for this work is to investigate
why SDP helps with those predicates. Theorems 1.1 and 1.2 suggest that a few cycles are not
sufficient to approximate better than the random assignment. This holds not only for SDP, but
also for any algorithm. Also, Theorem 1.3 gives us a separation between the ability of polynomial-time
algorithms and that of sublinear-time one-sided error testers, since SDP approximates better than the
random assignment in polynomial time.

It is an interesting question whether we can approximate Max $k$-CSP within a certain factor by
sampling a small portion of an instance. We can approximate the optimal solution of Max $k$-CSP
within an additive error $\epsilon n$ by sampling $\mathrm{poly}(1/\epsilon)$ variables and solving the induced problem [2].
Thus, dense instances are easy to approximate with a constant number of queries [21 OH]. However, little is
known for sparse instances. Solving Max Cut of a sparse graph by sampling is demonstrated in [6],
where it is shown that the value of the Goemans-Williamson SDP [12] for a randomly sampled subgraph of
linear size is approximately equal to the SDP value for the original graph. Our work complements
that result. Theorem 1.1 implies that, to approximate symmetric Max $k$-CSP better than the
random assignment, we need to sample $\Omega(n^{1/2+\delta})$ constraints from the instance.

Organization: In Section 2, we define notions used in this paper, followed by a proof overview of
Theorem 1.1, which is the main result of this paper. We give the proof of Theorem 1.1 in Section 3.
We mention other results in Section 4.

2 Preliminaries 

2.1 Definitions 

We define notions on hypergraphs. Let $\{v_1, \ldots, v_p\}$ be a vertex set and $\{e_1, \ldots, e_{p-1}\}$ be an edge
set such that $e_i$ contains $v_i$ and $v_{i+1}$ for $1 \le i \le p - 1$. Then, we call $\{e_1, \ldots, e_{p-1}\}$ a hyperpath. A
hypergraph is called connected if, for every two vertices, there is a hyperpath containing them. Let
$\{v_1, \ldots, v_p\}$ be a vertex set and $\{e_1, \ldots, e_p\}$ be an edge set such that $e_i$ contains $v_i$ and $v_{(i \bmod p)+1}$ for
$1 \le i \le p$. Then, we call $\{e_1, \ldots, e_p\}$ a hypercycle. A connected hypergraph is called a hypertree if it
does not have any hypercycle. A hyperforest is a hypergraph such that each connected component
is a hypertree. Let $H$ be a $k$-uniform hypergraph with $n$ vertices, $m$ edges and $c$ connected
components. We define $cy(H) = (k - 1)m - n + c$, which measures how many vertices are deficient
compared to a hyperforest (note that any hyperforest with $m$ edges and $c$ connected components
has $(k - 1)m + c$ vertices). We call $H$ a $(\gamma, \eta)$-expander if the subgraph of $H$ induced by any
$s \le \gamma n$ edges contains at least $(k - 1 - \eta)s$ vertices.
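
The quantity $cy(H)$ can be computed directly from its definition. The following is a minimal sketch, assuming the hypergraph is given as a list of $k$-tuples of distinct vertex labels and that only vertices appearing in some edge are counted (isolated vertices would add equally to $n$ and $c$ and cancel out). The function name `cy` and the union-find helper are illustrative choices, not part of the paper.

```python
def cy(edges):
    """Return cy(H) = (k - 1) * m - n + c for a k-uniform hypergraph.

    `edges` is a list of k-tuples of vertex labels (an assumed encoding).
    """
    vertices = {v for e in edges for v in e}
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Merge all vertices of each hyperedge into one component.
    for e in edges:
        root = find(e[0])
        for v in e[1:]:
            parent[find(v)] = root

    n = len(vertices)
    m = len(edges)
    c = len({find(v) for v in vertices})
    k = len(edges[0]) if edges else 0
    return (k - 1) * m - n + c
```

For example, a hyperpath of two 3-edges sharing one vertex gives $cy = 0$, while three 3-edges arranged in a hypercycle give $cy = 1$.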

Let $P : \{0,1\}^k \to \{0,1\}$ be a predicate. An instance $\Phi$ of CSP(P) is a tuple of a set of variables
and a set of constraints. Here, each constraint $C$ is defined over a $k$-tuple of variables $(x_1, \ldots, x_k)$
and is of the form $P(x_1 + b_1, \ldots, x_k + b_k) = 1$ for some $(b_1, \ldots, b_k) \in \{0,1\}^k$. We call $(b_1, \ldots, b_k)$ a
literal vector of $C$. Here, $b_i$ accounts for the possible negation of $x_i$. The underlying hypergraph of
$\Phi$ is a $k$-uniform hypergraph $H$ in which each variable of $\Phi$ corresponds to a vertex of $H$, and for
each constraint of the form $P(x_1 + b_1, \ldots, x_k + b_k) = 1$ in $\Phi$, we have an edge $(x_1, \ldots, x_k)$ in $H$.

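Next, we introduce notions on distributions. Random variables are written in boldface and their values
in plain type. Suppose that $\mathcal{D}$ is a distribution generating $\mathbf{x}_1, \mathbf{x}_2$
(and possibly others). Let $\mathcal{D}(\mathbf{x}_1)$ be the marginal distribution of $\mathbf{x}_1$ under $\mathcal{D}$. Let $\mathcal{D}(\mathbf{x}_1 | \mathbf{x}_2 = x_2)$
denote the marginal distribution of $\mathbf{x}_1$ conditioned on $\mathbf{x}_2 = x_2$, i.e.,
$\Pr_{\mathcal{D}(\mathbf{x}_1 | \mathbf{x}_2 = x_2)}[x_1] = \Pr_{\mathcal{D}}[x_1 | \mathbf{x}_2 = x_2]$.
We often omit the actual value of a random variable if it is unimportant. For example,
$\mathcal{D}(\mathbf{x}_1 | \mathbf{x}_2 = x_2)$ may be written as $\mathcal{D}(\mathbf{x}_1 | \mathbf{x}_2)$ and $\sum_x \Pr[\mathbf{x} = x]$ may be written as $\sum_x \Pr[\mathbf{x}]$. Let
$\mathrm{Supp}(\mathcal{D})$ denote the support of $\mathcal{D}$. If the random variables $\mathbf{x}_1$ and $\mathbf{x}_2$ become independent after
conditioning on $\mathbf{x}_3$, we write $\mathbf{x}_1 \perp\!\!\!\perp \mathbf{x}_2 \mid \mathbf{x}_3$. Let $\{\mathbf{x}_v\}_{v \in V}$ be a set of random variables. Then, for
$S \subseteq V$, $\mathbf{x}_S$ denotes the set $\{\mathbf{x}_v\}_{v \in S}$.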

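Let $\mathcal{D}_1$ and $\mathcal{D}_2$ be distributions generating a random variable $\mathbf{x}$. The total variation distance
between $\mathcal{D}_1(\mathbf{x})$ and $\mathcal{D}_2(\mathbf{x})$ is defined as$^1$

$$d_{TV}[\mathcal{D}_1(\mathbf{x}), \mathcal{D}_2(\mathbf{x})] = \sum_{x} \left| \Pr_{\mathcal{D}_1}[x] - \Pr_{\mathcal{D}_2}[x] \right|.$$

We note that $0 \le d_{TV}[\mathcal{D}_1(\mathbf{x}), \mathcal{D}_2(\mathbf{x})] \le 2$. Also, we define $d_{TV}[\mathcal{D}(\mathbf{x})] = d_{TV}[\mathcal{D}(\mathbf{x}), \mathcal{U}(\mathbf{x})]$, where $\mathcal{U}$ is
the uniform distribution. When $\mathbf{x}$ is Boolean, $0 \le d_{TV}[\mathcal{D}(\mathbf{x})] \le 1$.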
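
The normalization used here (a plain sum of absolute differences, without the usual factor 1/2) can be illustrated with a short sketch. Distributions are assumed to be given as dictionaries mapping outcomes to probabilities; the names `d_tv` and `d_tv_uniform` are hypothetical.

```python
def d_tv(p, q):
    """Sum over outcomes of |p(x) - q(x)|: twice the standard total variation distance."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def d_tv_uniform(p, outcomes):
    """Distance of p from the uniform distribution over `outcomes`."""
    uniform = {x: 1.0 / len(outcomes) for x in outcomes}
    return d_tv(p, uniform)
```

With this convention, two point masses on different outcomes are at distance 2, and a deterministic Boolean variable is at distance 1 from uniform, matching the two ranges stated above.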

2.2 Proof Overview 

We give a proof overview of Theorem 1.1. To prove the lower bound, we use Yao's minimax
principle [27]. Specifically, we design two distributions $\mathcal{D}_{sat}$ and $\mathcal{D}_{far}$ of instances of CSP(P) so
that all instances of $\mathcal{D}_{sat}$ are satisfiable, while almost all instances of $\mathcal{D}_{far}$ are $(|P^{-1}(0)|/2^k - \epsilon)$-far
from satisfiability. Then, we show that any deterministic algorithm with a sublinear number
of queries cannot distinguish between instances chosen from $\mathcal{D}_{sat}$ and instances chosen from $\mathcal{D}_{far}$.
For underlying hypergraphs of $\mathcal{D}_{sat}$ and $\mathcal{D}_{far}$, we use the same distribution of expanders. Thus,
if we ignore literal vectors and only look at the variables used in constraints, we have no hope of
distinguishing $\mathcal{D}_{sat}$ from $\mathcal{D}_{far}$. We describe how $\mathcal{D}_{sat}$ generates an instance. First, $\mathcal{D}_{sat}$ chooses an
underlying hypergraph $H = (V, E)$. Then, the set of variables of the instance is $\{x_v\}_{v \in V}$. Then,
$\mathcal{D}_{sat}$ chooses $x_v \in \{0,1\}$ for each vertex $v \in V$ uniformly at random. Here, the set $\{x_v\}_{v \in V}$
is the supposed solution for the instance. Next, $\mathcal{D}_{sat}$ chooses a literal vector $b_e$ for each edge
$e = (v_1, \ldots, v_k)$ and adds a constraint $C_e$ of the form $P((x_{v_1}, \ldots, x_{v_k}) + b_e) = 1$. $\mathcal{D}_{sat}$ chooses
$b_e$ so that the resulting instance is satisfied by $\{x_v\}_{v \in V}$. In contrast, $\mathcal{D}_{far}$ simply generates $b_e$
uniformly at random for each edge $e$ after choosing an underlying hypergraph.

$^1$ The total variation distance defined in Section 2.1 is twice as large as the standard definition. We use this definition to avoid unnecessary calculations.

Any algorithm with query complexity $\ell$ can be seen as a mapping from a query-answer history
$(q_1, a_1), \ldots, (q_{t-1}, a_{t-1})$ to $q_t$ for $t < \ell$ and to {accept, reject} for $t = \ell$. A query $q_t = (v_t, i_t)$ is a
pair of a variable $v_t$ and an index $i_t$, and an answer $a_t$ is a constraint or the information that there
is no constraint there. To analyze the distribution of the query-answer history of an algorithm
running under a distribution of instances, it is useful to think that there is a randomized process
behind the oracle. That is, when an algorithm asks a query of the oracle, the randomized process
generates the answer to the query according to some distribution. We later define a randomized
process $\mathcal{P}_{sat}$ (resp., $\mathcal{P}_{far}$), which is equivalent to $\mathcal{D}_{sat}$ (resp., $\mathcal{D}_{far}$) in the sense that no matter how an
algorithm asks the oracle, the distribution of instances we finally obtain is the same as $\mathcal{D}_{sat}$ (resp.,
$\mathcal{D}_{far}$). Let $\mathcal{K}_{sat}$ (resp., $\mathcal{K}_{far}$) be the distribution of the query-answer history induced by the interaction
between an algorithm $A$ and $\mathcal{P}_{sat}$ (resp., $\mathcal{P}_{far}$). We show that when the query complexity of $A$ is
$o(n^{1/2+\delta})$ for some $\delta > 0$, $d_{TV}[\mathcal{K}_{sat}, \mathcal{K}_{far}]$ is negligibly small. Thus, it is impossible to distinguish
$\mathcal{D}_{sat}$ from $\mathcal{D}_{far}$ with high probability.

If we ask at most $O(n^{1/2})$ queries, from the birthday paradox, the query-answer history does
not contain hypercycles with high probability. From this fact, it is relatively easy to show that we
cannot distinguish $\mathcal{D}_{sat}$ from $\mathcal{D}_{far}$ with $O(n^{1/2})$ queries. However, if we ask $\Omega(n^{1/2+\delta})$ queries, the
situation completely changes because of the effect of hypercycles. For example, suppose that the
predicate is EQU and $A$ obtained a constraint $C_e$ such that variables $x_u, x_v \in C_e$ already appeared
in the query-answer history. Then, $A$ can calculate the parity $x_u \oplus x_v$ in two ways by propagation: along
the constraint $C_e$ and along a path in the query-answer history. If the two values are not the same, the
instance must come from $\mathcal{D}_{far}$. In other words, if we assume that the instance comes from $\mathcal{D}_{sat}$, we
can guess $b_e$ from the query-answer history.
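
The propagation argument for EQU can be sketched with a union-find structure that stores parities. This is only an illustration of the idea above, not the tester of Theorem 1.4. It assumes each constraint is given as a pair `(variables, literals)`, so that a $k$-EQU constraint forces $x_u \oplus x_v = b_u \oplus b_v$ for any two of its variables; the name `equ_consistent` is hypothetical.

```python
def equ_consistent(constraints):
    """Return False iff parity propagation finds a contradicting hypercycle.

    `constraints` is a list of (variables, literals) pairs for k-EQU
    (an assumed encoding). A satisfiable set of constraints is never rejected.
    """
    parent, parity = {}, {}   # parity[v] = x_v XOR x_{parent[v]}

    def find(v):
        # Returns (root of v, x_v XOR x_root), compressing the path.
        if v not in parent:
            parent[v], parity[v] = v, 0
        if parent[v] == v:
            return v, 0
        root, p = find(parent[v])
        parent[v], parity[v] = root, parity[v] ^ p
        return root, parity[v]

    for variables, literals in constraints:
        u, bu = variables[0], literals[0]
        for v, bv in zip(variables[1:], literals[1:]):
            (ru, pu), (rv, pv) = find(u), find(v)
            want = bu ^ bv                 # forced value of x_u XOR x_v
            if ru == rv:
                if pu ^ pv != want:        # two propagation paths disagree
                    return False
            else:
                parent[ru], parity[ru] = rv, pu ^ pv ^ want
    return True
```

A contradiction can only be detected once some constraint closes a hypercycle, which is why $O(n^{1/2})$ queries are not enough for this check to fire.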

Can we generalize this algorithm to other predicates? Though we do not exclude the possibility
of sublinear-time algorithms, we can show that, in general, we need quite a few hypercycles to
distinguish $\mathcal{D}_{sat}$ from $\mathcal{D}_{far}$. The reason why we were able to use the propagation is that the
value of a variable in the predicate EQU uniquely determines the values of the other variables. For
other symmetric predicates, however, this is not true. In fact, the correlation between variables
decays exponentially along paths. Thus, even if variables $x_u$ and $x_v$ already appeared in the
query-answer history, the correlation between $x_u$ and $x_v$ is tiny (before obtaining $C_e$). Precisely, we
will show that $d_{TV}[\mathcal{D}_{sat}(\mathbf{x}_u | \mathbf{x}_v, \mathbf{b}_{E'}), \mathcal{D}_{sat}(\mathbf{x}_u | \mathbf{b}_{E'})]$ is tiny, where $E'$ is the edge set in the
query-answer history. Thus, the distribution of $b_e$ is almost identical to the uniform distribution. It follows that we cannot
distinguish $\mathcal{D}_{sat}$ from $\mathcal{D}_{far}$ with $O(n^{1/2+\delta})$ queries.

To prove this, we use several facts about expanders. Note that the lengths of hypercycles
are large (roughly, $g = \Theta(\log_d n)$) in an expander. Thus, for two adjacent vertices $u$ and $v$, the
distance between them is at least $g$ after removing the constraint containing them. Furthermore,
the neighborhood of $v$ looks like a hypertree $T$ with depth $g$. Note that any information from $x_u$
comes through the leaves of $T$. Though the number of leaves of $T$ is exponential in the depth, we
can show that only a tiny portion of them is connected to $u$ (without passing through $v$). Since such
leaves have an exponentially small correlation with $x_v$, we conclude that the correlation between
$x_u$ and $x_v$ is negligibly small.

2.3 Properties of $d_{TV}$

We show several lemmas about $d_{TV}$ and probability distributions. Due to the space limit, all the
proofs are deferred to Appendix A.

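Lemma 2.1. Let $\mathcal{D}_1$ and $\mathcal{D}_2$ be distributions generating random variables $\mathbf{x}$ and $\mathbf{y}$. Suppose
that $d_{TV}[\mathcal{D}_1(\mathbf{x}), \mathcal{D}_2(\mathbf{x})] \le \delta_x$, and $d_{TV}[\mathcal{D}_1(\mathbf{y} | \mathbf{x} = x), \mathcal{D}_2(\mathbf{y} | \mathbf{x} = x)] \le \delta_y$ for any $x$. Then,
$d_{TV}[\mathcal{D}_1(\mathbf{x}, \mathbf{y}), \mathcal{D}_2(\mathbf{x}, \mathbf{y})] \le \delta_x + \delta_y$.

Lemma 2.2. Let $\mathcal{D}$ be a distribution generating $\mathbf{x}_A, \mathbf{x}_B, \mathbf{x}_C$. Suppose that $\mathbf{x}_A \perp\!\!\!\perp \mathbf{x}_C \mid \mathbf{x}_B$. Then,

$$d_{TV}[\mathcal{D}(\mathbf{x}_C | \mathbf{x}_A)] \le d_{TV}[\mathcal{D}(\mathbf{x}_B | \mathbf{x}_A)] \cdot d_{TV}[\mathcal{D}(\mathbf{x}_C | \mathbf{x}_B)].$$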

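Lemma 2.3. Let $\mathcal{D}$ be a distribution generating $\mathbf{x}, \mathbf{y}_i$ ($1 \le i \le k$). Suppose that $\Pr_{\mathcal{D}}[\mathbf{x} = x]$ is equal
for every $x \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))$ and $\mathbf{y}_i \perp\!\!\!\perp \mathbf{y}_j \mid \mathbf{x}$ for every $1 \le i, j \le k$. Then,

$$\Pr_{\mathcal{D}}[\mathbf{x} = x \mid \mathbf{y}_1, \ldots, \mathbf{y}_k] = \frac{\prod_{i=1}^{k} \Pr_{\mathcal{D}}[\mathbf{x} = x \mid \mathbf{y}_i]}{\sum_{x' \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))} \prod_{i=1}^{k} \Pr_{\mathcal{D}}[\mathbf{x} = x' \mid \mathbf{y}_i]}.$$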
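
The product formula of Lemma 2.3 can be checked numerically on a toy distribution. The sketch below assumes a specific model chosen only for illustration: $\mathbf{x}$ is uniform on $\{0,1\}$ and each $\mathbf{y}_i$ is a copy of $\mathbf{x}$ flipped independently with probability `flip[i]`. The helper names `joint`, `posterior` and `product_formula` are hypothetical, and exact rational arithmetic is used so the two sides can be compared with `==`.

```python
from fractions import Fraction
from itertools import product


def joint(flip):
    """Joint law of (x, (y_1..y_k)): x uniform, y_i = x flipped w.p. flip[i]."""
    law = {}
    for x in (0, 1):
        for ys in product((0, 1), repeat=len(flip)):
            pr = Fraction(1, 2)
            for y, f in zip(ys, flip):
                pr *= f if y != x else 1 - f
            law[(x, ys)] = pr
    return law


def posterior(law, x, fixed):
    """Pr[x = x | y_i = value for every (i, value) in `fixed`]."""
    def matches(ys):
        return all(ys[i] == value for i, value in fixed.items())

    num = sum(p for (a, ys), p in law.items() if a == x and matches(ys))
    den = sum(p for (a, ys), p in law.items() if matches(ys))
    return num / den


def product_formula(law, x, ys):
    """Right-hand side of Lemma 2.3 for the observed values `ys`."""
    def prod_for(a):
        out = Fraction(1)
        for i, y in enumerate(ys):
            out *= posterior(law, a, {i: y})
        return out

    return prod_for(x) / sum(prod_for(a) for a in (0, 1))
```

In this model the posterior given all observations agrees exactly with the normalized product of the single-observation posteriors.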

3 An $\Omega(n^{1/2+\delta})$ Lower Bound for Two-Sided Error Testers

In this section, we give a proof of Theorem 1.1. A reader can safely assume that a predicate $P$ is
symmetric until the proof of Theorem 1.1.

3.1 Probabilistic Constructions of Expanders 

We introduce a probability distribution $\mathcal{G}_{n,d,k}$ of $d$-regular $k$-uniform multi-hypergraphs with $n$
vertices. This distribution is used to define $\mathcal{D}_{sat}$ and $\mathcal{D}_{far}$. Here, we assume that $dn$ is divisible by $k$
(otherwise, no $d$-regular $k$-uniform hypergraph exists). We construct a hypergraph $H = (V, E)$ as
follows. We start with a set $V'$ of $dn$ vertices, where each vertex $v \in V$ corresponds to $d$ vertices in
$V'$. Then, we partition $V'$ into $k$-hyperedges randomly. Finally, we contract each group of $d$ vertices of $V'$
corresponding to the same vertex of $V$ and let $H$ be the resulting hypergraph. The proof of the following lemma is deferred to Appendix B.1.

Lemma 3.1. Let $H$ be a hypergraph chosen uniformly at random from $\mathcal{G}_{n,d,k}$. For any $\eta$, there
exists $\gamma$ such that $H$ is a $(\gamma, \eta)$-expander with probability $1 - o(1)$.
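
The construction of $\mathcal{G}_{n,d,k}$ described above can be sketched as a configuration-model sampler: create $d$ copies of every vertex, shuffle them, and cut the shuffled sequence into blocks of size $k$. This is a minimal sketch under the assumption that a uniformly random shuffle followed by consecutive grouping realizes the random partition of $V'$; the function name `sample_hypergraph` and the `rng` parameter are illustrative.

```python
import random


def sample_hypergraph(n, d, k, rng=random):
    """Sample a d-regular k-uniform multi-hypergraph on vertices 0..n-1.

    Returns a list of d*n/k hyperedges, each a k-tuple of vertices.
    Because this is a multi-hypergraph, a vertex may repeat inside an edge.
    """
    if (d * n) % k != 0:
        raise ValueError("d * n must be divisible by k")
    # V': d copies ("stubs") of every vertex of V.
    stubs = [v for v in range(n) for _ in range(d)]
    rng.shuffle(stubs)
    # Partition V' into k-hyperedges; contraction is implicit because each
    # stub already carries the label of the vertex it belongs to.
    return [tuple(stubs[i:i + k]) for i in range(0, d * n, k)]
```

Passing a seeded `random.Random` instance as `rng` makes the sample reproducible.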

3.2 Hard instances 

As in the proof overview, we introduce two distributions $\mathcal{D}_{sat}$ and $\mathcal{D}_{far}$ of instances of CSP(P).
First, we define a distribution generating instances of CSP(P) given an underlying hypergraph.

Definition 3.2. Let $H = (V, E)$ be a $k$-uniform hypergraph with $n$ vertices. Define a distribution
$\mathcal{D}_H$ generating an instance $\Phi$ of CSP(P) as follows. The variable set of $\Phi$ is $\{x_v\}_{v \in V}$. We choose
$x \in \{0,1\}^n$ uniformly at random. For each edge $e = (v_1, \ldots, v_k) \in E$, we choose $b_e$ uniformly at
random from the set $\{b \in \{0,1\}^k \mid P((x_{v_1}, \ldots, x_{v_k}) + b) = 1\}$. Then, we add a constraint $C_e$ of the
form $P((x_{v_1}, \ldots, x_{v_k}) + b_e) = 1$ to $\Phi$.



Definition 3.3. Given parameters $n, d, k$, define a distribution $\mathcal{D}_{sat}$ generating an instance of
CSP(P) as follows. First, we choose a hypergraph $H$ from $\mathcal{G}_{n,d,k}$. Then, an instance is output
according to $\mathcal{D}_H$.

Similarly, define a distribution $\mathcal{D}_{far}$ generating an instance of CSP(P) as follows. First, we
choose a hypergraph $H = (V, E)$ from $\mathcal{G}_{n,d,k}$. Then, for each edge $e = (v_1, \ldots, v_k) \in E$, we choose
$b \in \{0,1\}^k$ uniformly at random and add a constraint $C_e$ of the form $P((x_{v_1}, \ldots, x_{v_k}) + b) = 1$.
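
Given a fixed underlying hypergraph, the two ways of choosing literal vectors in Definitions 3.2 and 3.3 can be sketched as follows. The encoding of an instance as a list of `(edge, literal_vector)` pairs and the names `accepting_literals`, `sample_sat`, `sample_far`, `satisfied` and `nae` are assumptions made for this illustration. Addition over $\{0,1\}$ is implemented as XOR, and the predicate is assumed to satisfy $|P^{-1}(1)| > 0$ so that an accepting literal vector always exists.

```python
import random
from itertools import product


def accepting_literals(pred, values):
    """All literal vectors b with P(values + b) = 1 (addition is XOR)."""
    return [b for b in product((0, 1), repeat=len(values))
            if pred(tuple(x ^ bi for x, bi in zip(values, b)))]


def sample_sat(edges, n, pred, rng=random):
    """Planted instance: literals are chosen consistent with a hidden x."""
    x = [rng.randrange(2) for _ in range(n)]
    instance = [(e, rng.choice(accepting_literals(pred, [x[v] for v in e])))
                for e in edges]
    return instance, x


def sample_far(edges, k, rng=random):
    """Literal vectors chosen uniformly at random, ignoring any solution."""
    return [(e, tuple(rng.randrange(2) for _ in range(k))) for e in edges]


def satisfied(instance, pred, x):
    """Check whether assignment x satisfies every constraint."""
    return all(pred(tuple(x[v] ^ b for v, b in zip(e, lits)))
               for e, lits in instance)


def nae(x):
    return int(len(set(x)) == 2)   # example predicate: not all equal
```

By construction, every instance returned by `sample_sat` is satisfied by its hidden assignment, whereas `sample_far` makes no such guarantee.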

We can describe the generating process of $\mathcal{D}_{sat}$ with a graphical model. Each vertex in the
graphical model corresponds to $\mathbf{x}_v$ ($v \in V$) or $\mathbf{b}_e$ ($e \in E$), and each edge expresses the dependency
between two random variables. For an exposition of graphical models, see [1]. The important fact
derived from the graphical model is the following.

Observation 3.4. Let $H$ be a hypergraph and $G = (V, E)$ be a subgraph of $H$. Let $A, B, C$ be sets
of vertices such that any path in $G$ between $A$ and $C$ passes through a vertex of $B$. Then, $\mathbf{x}_A \perp\!\!\!\perp \mathbf{x}_C \mid \mathbf{x}_B$
under $\mathcal{D}_H(\cdot \mid \mathbf{b}_E)$.

From the construction, any instance of $\mathcal{D}_{sat}$ is satisfiable. On the other hand, the following
lemma is well known (e.g., [23, 26]). We provide a proof for completeness in Appendix B.2.

Lemma 3.5. For any $\epsilon > 0$, there exists an integer $d \ge 1$ for which the following holds. Let $\Phi$
be an instance of CSP(P) chosen from $\mathcal{D}_{far}$, where $P : \{0,1\}^k \to \{0,1\}$ is a predicate. Then, $\Phi$ is
$(|P^{-1}(0)|/2^k - \epsilon)$-far from satisfiability with a probability of $1 - o(1)$.

3.3 Randomized processes equivalent to $\mathcal{D}_{sat}$ and $\mathcal{D}_{far}$

We show that, with high probability, any algorithm $A$ with $O(n^{1/2+\delta})$ queries running on the distribution
$\mathcal{D}_{sat}$ or $\mathcal{D}_{far}$ can find at most $O(n^{3\delta})$ cycles, and the lengths of those cycles are $\Omega(\log_{dk} n)$.

We define a randomized process $\mathcal{P}_{sat}$, which interacts with $A$, so that $\mathcal{P}_{sat}$ answers queries from
$A$ while constructing a random instance from $\mathcal{D}_{sat}$. Thus, the interaction of $\mathcal{P}_{sat}$ with $A$ captures a
random execution of $A$ on an instance drawn from $\mathcal{D}_{sat}$. Similarly, we define a randomized
process $\mathcal{P}_{far}$, which imitates $\mathcal{D}_{far}$.

The process $\mathcal{P}_{sat}$ has two stages. The first stage continues as long as $A$ performs queries,
and $\mathcal{P}_{sat}$ answers those queries. In the second stage, $\mathcal{P}_{sat}$ determines the rest of the instance.
$\mathcal{P}_{sat}$ internally holds a supposed solution $\{x_v\}_{v \in V}$, which is hidden from $A$. Literal vectors are
determined so as not to contradict this solution.

First stage of $\mathcal{P}_{sat}$: Starting from $t = 1$, for each query $q_t = (v_t, i_t)$ of $A$, $\mathcal{P}_{sat}$ proceeds as follows.
For each vertex $u$, we define the remaining degree $r(u)$ as the number of constraints adjacent to $u$ which
are not yet accessed by $A$. We choose $u_2, \ldots, u_k$ with probabilities according to their remaining
degrees. Specifically, since the sum of the remaining degrees of all vertices at the time that $A$ specifies
$v_t$ is $dn - (t - 1)k - 1$, the probability that a vertex $u$ is chosen as $u_2$ is $r(u)/(dn - (t - 1)k - 1)$.
Similarly, the probability that $u$ is chosen as $u_3$ is $r(u)/(dn - (t - 1)k - 2)$ since the sum of the
remaining degrees decreases by one. This process continues until $u_k$ is chosen. Finally, we form an
edge $e = (u_1 = v_t, \ldots, u_k)$. For each chosen vertex $u_i \in e$, if the supposed solution $x_{u_i}$ is not
determined yet, $\mathcal{P}_{sat}$ chooses $x_{u_i} \in \{0,1\}$ uniformly at random. Then, $\mathcal{P}_{sat}$ chooses a literal vector
$b_e \in \{0,1\}^k$ uniformly at random from $\{b \in \{0,1\}^k \mid P((x_{u_1}, \ldots, x_{u_k}) + b) = 1\}$. Finally, $\mathcal{P}_{sat}$
returns the constraint $C_e$ of the form $P((x_{u_1}, \ldots, x_{u_k}) + b_e) = 1$ to $A$.

Second stage of $\mathcal{P}_{sat}$: Among all possibilities for the rest of the underlying graph, $\mathcal{P}_{sat}$ chooses
one of them uniformly at random. Then, $\mathcal{P}_{sat}$ decides $x_v$ and $b_e$ randomly in the same way as in the
first stage.
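
The first stage can be sketched as lazy matching of degree slots: each vertex owns $d$ slots, a query $(v, i)$ asks for slot $i$ of $v$, and an unmatched slot is completed into a hyperedge by drawing $k - 1$ further free slots uniformly at random, which picks a vertex with probability proportional to its remaining degree. This is a simplified illustration under those assumptions, not the paper's formal process; the class name `LazySatProcess` and its attributes are hypothetical, and indices are 0-based.

```python
import random
from itertools import product


class LazySatProcess:
    """Answers queries while building a planted instance on the fly."""

    def __init__(self, n, d, k, pred, rng=None):
        if (d * n) % k != 0:
            raise ValueError("d * n must be divisible by k")
        self.k, self.pred = k, pred
        self.rng = rng or random.Random()
        self.free = [(v, i) for v in range(n) for i in range(d)]  # unmatched slots
        self.edge_of = {}   # slot -> constraint already revealed
        self.x = {}         # hidden supposed solution, filled lazily

    def _take(self, slot=None):
        # Remove the given slot, or a uniformly random free slot.
        if slot is None:
            idx = self.rng.randrange(len(self.free))
        else:
            idx = self.free.index(slot)
        self.free[idx], self.free[-1] = self.free[-1], self.free[idx]
        return self.free.pop()

    def query(self, v, i):
        if (v, i) in self.edge_of:            # already revealed
            return self.edge_of[(v, i)]
        slots = [self._take((v, i))]
        slots += [self._take() for _ in range(self.k - 1)]
        variables = tuple(u for u, _ in slots)
        for u in variables:                   # fix hidden values on demand
            if u not in self.x:
                self.x[u] = self.rng.randrange(2)
        values = [self.x[u] for u in variables]
        choices = [b for b in product((0, 1), repeat=self.k)
                   if self.pred(tuple(a ^ c for a, c in zip(values, b)))]
        constraint = (variables, self.rng.choice(choices))
        for s in slots:
            self.edge_of[s] = constraint
        return constraint
```

A process imitating $\mathcal{D}_{far}$ would differ only in drawing the literal vector uniformly from $\{0,1\}^k$ without consulting the hidden assignment.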



The process $\mathcal{P}_{far}$ proceeds in an almost identical manner. The only difference is that $\mathcal{P}_{far}$ does
not keep track of the supposed solution and always chooses literal vectors uniformly at random. It
is easy to confirm that the following lemma holds using induction on the number of queries, and
we omit the proof (see Lemma 7.3 of [15] for details).

Lemma 3.6. For every algorithm $A$, the process $\mathcal{P}_{sat}$ (resp., $\mathcal{P}_{far}$) uniformly generates instances
of $\mathcal{D}_{sat}$ (resp., $\mathcal{D}_{far}$) when interacting with $A$. □

The proofs of the following two lemmas are deferred to Appendices B.3 and B.4.

Lemma 3.7. Let $\delta > 0$ and $G$ be the hypergraph induced by the query-answer history after
$O(n^{1/2+\delta})$ steps of interactions between an algorithm $A$ and $\mathcal{P}_{sat}$ (or $\mathcal{P}_{far}$). Then, with a
probability of at least $1 - o(1)$, $cy(G) = O(k^2 n^{3\delta})$.

Lemma 3.8. Let $\delta > 0$ and $G$ be the hypergraph induced by the query-answer history after
$O(n^{1/2+\delta})$ steps of interactions between an algorithm $A$ and $\mathcal{P}_{sat}$ (or $\mathcal{P}_{far}$). Then, with a
probability of at least $1 - o(1)$, the girth of $G$ is at least $g = (1/2 - 2\delta) \log_{dk} n$.

3.4 Correlation decay along edges of a hypertree 

Let $\Phi$ be an instance of CSP(P) generated by $\mathcal{D}_{sat}$. Suppose that $T = (V, E)$ is a subgraph of the
underlying graph of $\Phi$ and $T$ is a hypertree. Let $v \in V$ be an (arbitrary) root of $T$ and $L$ be a subset
of leaves of $T$. In this subsection, we consider how the information of $\mathbf{x}_L$ propagates into $\mathbf{x}_v$ along
edges of $T$. Specifically, we calculate $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_E), \mathcal{D}_H(\mathbf{x}_v | \mathbf{b}_E)]$. A proof of the next lemma
is given in Appendix B.5.

Lemma 3.9. Let $T = (V, E)$ be a subgraph of a hypergraph $H$. If $T$ is a hypertree, then $\mathbf{x}_v$ and
$\mathbf{b}_E$ are independent for any $v \in V$ under $\mathcal{D}_H$.

From Lemma 3.9, $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_E), \mathcal{D}_H(\mathbf{x}_v | \mathbf{b}_E)] = d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_E)]$ holds.
Next, we see how $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_E)]$ propagates when subtrees are connected at a vertex or an edge.
Proofs of the next two lemmas are given in Appendices B.6 and B.7, respectively.

Lemma 3.10. Let $T = (V, E)$ be a subgraph of a hypergraph $H$. Suppose that $T$ is a hypertree.
Let $T_1, \ldots, T_\ell$ be the set of the subtrees obtained by splitting $v \in V$ and $L_i$ ($1 \le i \le \ell$) be a subset of
the leaves of $T_i$. Then,

$$d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \{\mathbf{x}_{L_i}\}_{i=1}^{\ell}, \mathbf{b}_E)] \le \sum_{i=1}^{\ell} d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_{L_i}, \mathbf{b}_E)].$$



Lemma 3.11. Let $T = (V, E)$ be a subgraph of a hypergraph $H$. Suppose that $T$ is a hypertree. Let
$T_1, \ldots, T_{k-1}$ be the set of subtrees obtained by removing $e = (v_1, \ldots, v_k) \in E$. Here, $v_i$ is the root
of $T_i$. Let $L_i$ ($1 \le i \le k - 1$) be a subset of the leaves of $T_i$. Then,

$$d_{TV}[\mathcal{D}_H(\mathbf{x}_{v_k} | \{\mathbf{x}_{L_i}\}_{i=1}^{k-1}, \mathbf{b}_E)] \le \rho(P) \sum_{i=1}^{k-1} d_{TV}[\mathcal{D}_H(\mathbf{x}_{v_i} | \mathbf{x}_{L_i}, \mathbf{b}_E)].$$

Here, $\rho(P) \le 1$ is a constant, which only depends on the (symmetric) predicate $P$. In particular,
$\rho(P) < 1$ if $P$ is not EQU.



3.5 Putting things together 

Let $\rho(P)$ be the constant determined in Lemma 3.11.

Lemma 3.12. Let $G = (V, E)$ be a subgraph of a hypergraph $H$ with girth $g$ and let $e \in E$ be an
edge. Then, for any $v \in e$ and $S \subseteq e - \{v\}$, $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_S, \mathbf{b}_{E-e})] \le \rho(P)^g (2cy(G - e) + k)$.

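Proof. Let $T = (V_T, E_T)$ be the subgraph of $G$ induced by vertices whose distance from $v$ in $G - e$
is at most $g$. Note that $T$ is a hypertree rooted at $v$ since the girth of $G - e$ is $g$. For a leaf $u$ of
$T$, let $C_u$ be the resulting connected component containing $u$ after removing $T$. We define $L$ as a
subset of leaves as follows. A leaf $u$ is in $L$ iff $C_u$ contains a vertex of $S$ or $C_u$ is not a hypertree.
Once $L$ is connected to all vertices of $S$, each leaf $u \in L$ involves a cycle. Thus, $|L|$ is at most
$2cy(G - e) + k$. From Lemma 2.2,

$$d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_S, \mathbf{b}_{E-e})] \le d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_{E-e})] \cdot d_{TV}[\mathcal{D}_H(\mathbf{x}_L | \mathbf{x}_S, \mathbf{b}_{E-e})] \le 2\, d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_{E-e})].$$

For each leaf $u \notin L$ of $T$, we can truncate edges of $C_u$ since they have no information about $\mathbf{x}_u$ from
Lemma 3.9. Also, $\mathbf{b}_{E_T} \perp\!\!\!\perp \mathbf{b}_{E-e-E_T} \mid \mathbf{x}_L$. Thus, $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_S, \mathbf{b}_{E-e})] \le 2\, d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_{E_T})]$.
Now, to calculate $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_{E_T})]$, we recursively use Lemmas 3.10 and 3.11 from the leaves.
For each leaf $u$ of $T$, we consider $d_{TV}[\mathcal{D}_H(\mathbf{x}_u | \mathbf{x}_{L_u}, \mathbf{b}_{E_T})]$ where $L_u = \{u\} \cap L$. We note that
$d_{TV}[\mathcal{D}_H(\mathbf{x}_u | \mathbf{x}_{L_u}, \mathbf{b}_{E_T})] = 1$ for $u \in L$, and $d_{TV}[\mathcal{D}_H(\mathbf{x}_u | \mathbf{x}_{L_u}, \mathbf{b}_{E_T})] = 0$ for a leaf $u \notin L$. Then, it is
clear that $d_{TV}[\mathcal{D}_H(\mathbf{x}_v | \mathbf{x}_L, \mathbf{b}_{E_T})] \le \rho(P)^g |L| \le \rho(P)^g (2cy(G - e) + k)$. □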

Lemma 3.13. Let $G = (V, E)$ be a subgraph of a hypergraph $H$ with girth $g$ and let $e \in E$ be an edge in a hypercycle of $G$. Then, $d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{b}_e \mid \mathbf{b}_{E-e})] \le k \rho(P)^g (4\,\mathrm{cy}(G - e) + 2k)$.

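Proof. From Lemma 2.2, $d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{b}_e \mid \mathbf{b}_{E-e})] \le d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_e \mid \mathbf{b}_{E-e})] \, d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{b}_e \mid \mathbf{x}_e)] \le 2\, d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_e \mid \mathbf{b}_{E-e})]$.
Let $e = (v_1, \ldots, v_k)$. From Lemma 2.1, $d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_e \mid \mathbf{b}_{E-e})] \le \sum_{i=1}^{k} d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_i} \mid \{\mathbf{x}_{v_j}\}_{j=1}^{i-1}, \mathbf{b}_{E-e})]$.
From Lemma 3.12, we have $d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_i} \mid \{\mathbf{x}_{v_j}\}_{j=1}^{i-1}, \mathbf{b}_{E-e})] \le \rho(P)^g (2\,\mathrm{cy}(G - e) + k)$ for $2 \le i \le k$.
This inequality also holds for $i = 1$ since $\mathcal{D}_H(\mathbf{x}_{v_1} \mid \mathbf{b}_{E-e})$ is a convex combination of $\mathcal{D}_H(\mathbf{x}_{v_1} \mid \mathbf{x}_{v_2} = 0, \mathbf{b}_{E-e})$ and $\mathcal{D}_H(\mathbf{x}_{v_1} \mid \mathbf{x}_{v_2} = 1, \mathbf{b}_{E-e})$. Thus, the lemma holds. □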

We show a weaker version of Theorem 1.1, which is only for symmetric predicates.

Theorem 3.14. Let $P : \{0,1\}^k \to \{0,1\}$ be a symmetric predicate except k-EQU where $k \ge 3$. Then, for any $\varepsilon > 0$, there exist $\delta > 0$ and $d \ge 1$ such that every $(|P^{-1}(0)|/2^k - \varepsilon)$-tester for CSP(P) with a degree bound $d$ requires $\Omega(n^{1/2+\delta})$ queries, where $\delta = O(1/\log(k/\varepsilon^2))$.

Proof. Suppose that there exists a deterministic $(|P^{-1}(0)|/2^k - \varepsilon)$-tester $\mathcal{A}$ for CSP(P) with query complexity $t = o(n^{1/2+\delta})$. We choose $\delta > 0$ later. From Lemmas 3.7 and 3.8, by the union bound, $\mathcal{A}$ finds vertices in the current query-answer history at most $c = O(k^2 n^{2\delta})$ times, and the length of every cycle found is at least $g = (1/2 - 2\delta) \log_{dk} n$, with probability $1 - o(1)$. In what follows, we condition on these events.

We consider a decision tree $\mathcal{T}_{\mathrm{sat}}$ generated by interactions between $\mathcal{A}$ and $\mathcal{D}_{\mathrm{sat}}$. To define $\mathcal{T}_{\mathrm{sat}}$, we suppose that an interaction between $\mathcal{A}$ and $\mathcal{D}_{\mathrm{sat}}$ proceeds in two steps; i.e., $\mathcal{D}_{\mathrm{sat}}$ first returns a set $X$ of $k$ variables that will be used in the answer constraint, and then returns a literal vector $b$ for the constraint. Corresponding to these two steps, $\mathcal{T}_{\mathrm{sat}}$ has two kinds of vertices, i.e., S-vertices (state vertices) and I-vertices (intermediate vertices). In any path of the tree from the root to a leaf, S-vertices and I-vertices appear alternately. Each S-vertex $v$ corresponds to a particular state of the query-answer history. When $\mathcal{A}$ obtains a set of variables $X$ from $\mathcal{D}_{\mathrm{sat}}$, the state proceeds to an I-vertex $u$, which is a child of $v$. The edge $(v, u)$ is associated with $X$ and the transition probability. After that, $\mathcal{A}$ obtains a literal vector $b$ from $\mathcal{D}_{\mathrm{sat}}$. Then, the state proceeds to an S-vertex $v'$, which is a child of $u$. The edge $(u, v')$ is associated with $b$ and the transition probability. The tree $\mathcal{T}_{\mathrm{far}}$ is defined similarly. In particular, $\mathcal{T}_{\mathrm{sat}}$ and $\mathcal{T}_{\mathrm{far}}$ are isomorphic.

We consider couplings of corresponding vertices in $\mathcal{T}_{\mathrm{sat}}$ and $\mathcal{T}_{\mathrm{far}}$ (one is mapped to the other by the isomorphism). Suppose that $v_{\mathrm{sat}}$ and $v_{\mathrm{far}}$ are a pair of coupled S-vertices. Since the distributions of variables returned by the first step of an interaction are identical between $\mathcal{D}_{\mathrm{sat}}$ and $\mathcal{D}_{\mathrm{far}}$, the transition distributions to their children are identical. Next, suppose that $v_{\mathrm{sat}}$ and $v_{\mathrm{far}}$ are a pair of coupled I-vertices. If the constraint returned by the previous step does not form a new hypercycle, the transition distributions to their children are identical. If the constraint forms a new hypercycle, then from Lemma 3.13 the total variation distance between the distributions of literal vectors is at most $k(4c + 2k)\rho(P)^g$. With the probability corresponding to this distance, we suppose that $\mathcal{A}$ succeeds in distinguishing $\mathcal{D}_{\mathrm{sat}}$ from $\mathcal{D}_{\mathrm{far}}$ and terminates. This only makes $\mathcal{A}$ more powerful. After this modification, the transition distributions to their children become identical. Consequently, for any pair of coupled leaves of $\mathcal{T}_{\mathrm{sat}}$ and $\mathcal{T}_{\mathrm{far}}$, the transition probabilities from the root to them are the same. Thus, we cannot distinguish $\mathcal{D}_{\mathrm{sat}}$ from $\mathcal{D}_{\mathrm{far}}$ if we reach a leaf of the decision tree.

Thus, it remains to calculate the sum of the discarded probabilities. Consider any path $p$ from the root to a leaf of $\mathcal{T}_{\mathrm{sat}}$. Since $\mathcal{A}$ finds vertices in the query-answer history at most $c$ times, the sum of the discarded probabilities along $p$ is at most $ck(4c + 2k)\rho(P)^g$. Since the tree is a convex combination of paths (with respect to transition probabilities), the total discarded probability is at most $ck(4c + 2k)\rho(P)^g$. By choosing $\delta = O(1/\log(k/\varepsilon^2))$, this value becomes a small constant. Thus, $d_{\mathrm{TV}}[\mathcal{K}_{\mathrm{sat}}, \mathcal{K}_{\mathrm{far}}] \ll 1$, where $\mathcal{K}_{\mathrm{sat}}$ (resp., $\mathcal{K}_{\mathrm{far}}$) is the distribution of the query-answer history induced by the $t$ steps of interactions between $\mathcal{A}$ and $\mathcal{D}_{\mathrm{sat}}$ (resp., $\mathcal{D}_{\mathrm{far}}$).

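Since $\mathcal{A}$ is a $(|P^{-1}(0)|/2^k - \varepsilon)$-tester, $\Pr[\mathcal{A}(\mathcal{K}_{\mathrm{sat}}) = \text{accept}] \ge 2/3$. On the other hand, since a $1 - o(1)$ fraction of instances of $\mathcal{D}_{\mathrm{far}}$ is $(|P^{-1}(0)|/2^k - \varepsilon)$-far from satisfiability, $\Pr[\mathcal{A}(\mathcal{K}_{\mathrm{far}}) = \text{accept}] \le (1 - o(1)) \cdot \frac{1}{3} + o(1) = \frac{1}{3} + o(1)$. This is, however, a contradiction since $d_{\mathrm{TV}}[\mathcal{K}_{\mathrm{sat}}, \mathcal{K}_{\mathrm{far}}] \ll 1$. □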

Proof of Theorem 1.1. Let $Q : \{0,1\}^k \to \{0,1\}$ be a predicate such that $P^{-1}(1) \subseteq Q^{-1}(1)$ for a symmetric predicate $P : \{0,1\}^k \to \{0,1\}$ except k-EQU. We slightly change the definition of $\mathcal{D}_H$. That is, for each edge $e = (v_1, \ldots, v_k)$ of $H$, we choose $\mathbf{b}_e$ uniformly at random from the set $\{b \in \{0,1\}^k \mid P((\mathbf{x}_{v_1}, \ldots, \mathbf{x}_{v_k}) + b) = 1\}$ instead of $\{b \in \{0,1\}^k \mid Q((\mathbf{x}_{v_1}, \ldots, \mathbf{x}_{v_k}) + b) = 1\}$. Then, the rest of the proof is the same as the proof of Theorem 3.14. □

4 Other Results 

A proof of Theorem 1.2 is given in Appendix C. The proof is similar to the proof of Theorem 1.1. The only modification needed is in the construction of $\mathcal{D}_{\mathrm{sat}}$. Specifically, we introduce noise to the literal vector of each constraint with probability $\varepsilon$ so that propagation can no longer be used to guess the values of variables. A proof of Theorem 1.3 is given in Appendix D. A one-sided error tester $\mathcal{A}$ cannot reject an instance $\Phi$ until $\mathcal{A}$ finds evidence that $\Phi$ is not satisfiable. As a hard instance, we use an instance obtained from $\mathcal{D}_{\mathrm{far}}$, which is defined in Section 3.2. We show that it is far from satisfiability while any linear-size sub-instance of it is satisfiable. This leads to a linear lower bound. A proof of Theorem 1.4 is given in Appendix E. We reduce k-EQU to the problem of testing bipartiteness of a graph, and then use the tester for bipartiteness given in [14]. Proofs of Theorems 1.5 and 1.6 are given in Appendix F. Similar to [26], we define a predicate $P$ using Hamming code. Using algebraic properties of Hamming code, we show that CSP(P) is hard to test with sublinear queries. Our proof can be seen as an extension of the proof of [8], which showed a linear lower bound for testing 3-XOR. We prove Theorem 1.6 using a reduction from the hardness of k-CSP.

Appendix



A Proof of Subsection I2HJ1 

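A.1 Proof of Lemma 2.1

Proof.

\[
\begin{aligned}
\sum_{x,y} \Bigl| \Pr_{\mathcal{D}_1}[x, y] - \Pr_{\mathcal{D}_2}[x, y] \Bigr|
&= \sum_{x,y} \Bigl| \Pr_{\mathcal{D}_1}[x] \Pr_{\mathcal{D}_1}[y \mid x] - \Pr_{\mathcal{D}_2}[x] \Pr_{\mathcal{D}_2}[y \mid x] \Bigr| \\
&\le \sum_{x,y} \Bigl| \Pr_{\mathcal{D}_1}[x] - \Pr_{\mathcal{D}_2}[x] \Bigr| \Pr_{\mathcal{D}_1}[y \mid x] + \sum_{x,y} \Pr_{\mathcal{D}_2}[x] \Bigl| \Pr_{\mathcal{D}_1}[y \mid x] - \Pr_{\mathcal{D}_2}[y \mid x] \Bigr| \\
&\le \delta_1 + \delta_2.
\end{aligned}
\]
□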

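A.2 Proof of Lemma 2.2

Proof. We consider the following value.

\[
\begin{aligned}
&\sum_{x_B} (\Pr[x_B \mid x_A] - \Pr[x_B]) (\Pr[x_C \mid x_B] - \Pr[x_C]) \\
&\quad= \sum_{x_B} \bigl( (\Pr[x_B \mid x_A] - \Pr[x_B]) \Pr[x_C \mid x_B] - (\Pr[x_B \mid x_A] - \Pr[x_B]) \Pr[x_C] \bigr) \\
&\quad= \sum_{x_B} (\Pr[x_C, x_B \mid x_A] - \Pr[x_B, x_C]) \qquad (\text{from } \mathbf{x}_A \perp \mathbf{x}_C \mid \mathbf{x}_B) \\
&\quad= \Pr[x_C \mid x_A] - \Pr[x_C].
\end{aligned}
\]

Then,

\[
\begin{aligned}
d_{\mathrm{TV}}[\mathcal{D}(\mathbf{x}_C \mid \mathbf{x}_A), \mathcal{D}(\mathbf{x}_C)]
&= \sum_{x_C} \bigl| \Pr[x_C \mid x_A] - \Pr[x_C] \bigr| \\
&= \sum_{x_C} \Bigl| \sum_{x_B} (\Pr[x_B \mid x_A] - \Pr[x_B]) (\Pr[x_C \mid x_B] - \Pr[x_C]) \Bigr| \\
&\le \sum_{x_C} \sum_{x_B} \bigl( |\Pr[x_B \mid x_A] - \Pr[x_B]| \cdot |\Pr[x_C \mid x_B] - \Pr[x_C]| \bigr) \\
&\le \sum_{x_B} |\Pr[x_B \mid x_A] - \Pr[x_B]| \cdot \sum_{x_C} |\Pr[x_C \mid x_B] - \Pr[x_C]| \\
&\le d_{\mathrm{TV}}[\mathcal{D}(\mathbf{x}_B \mid \mathbf{x}_A)] \cdot d_{\mathrm{TV}}[\mathcal{D}(\mathbf{x}_C \mid \mathbf{x}_B)].
\end{aligned}
\]
□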
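The product inequality can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a Markov chain $\mathbf{x}_A \to \mathbf{x}_B \to \mathbf{x}_C$ (so that $\mathbf{x}_A \perp \mathbf{x}_C \mid \mathbf{x}_B$), measures distance as the $L_1$ sum used in the proof, and reads the last factor as the maximum over $x_B$. All function names are hypothetical.

```python
import random

def l1(p, q):
    """L1 distance between two distributions given as equal-length lists."""
    return sum(abs(a - b) for a, b in zip(p, q))

def random_dist(n, rng):
    """A random strictly positive probability vector of length n."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

def chain_bounds(n_a, n_b, n_c, rng):
    """For a random Markov chain A -> B -> C, return one (lhs, rhs) pair per value of x_A."""
    p_a = random_dist(n_a, rng)
    b_given_a = [random_dist(n_b, rng) for _ in range(n_a)]
    c_given_b = [random_dist(n_c, rng) for _ in range(n_b)]
    # Marginals of x_B and x_C.
    p_b = [sum(p_a[a] * b_given_a[a][b] for a in range(n_a)) for b in range(n_b)]
    p_c = [sum(p_b[b] * c_given_b[b][c] for b in range(n_b)) for c in range(n_c)]
    pairs = []
    for a in range(n_a):
        # Conditional independence lets us compose the two kernels.
        c_given_a = [sum(b_given_a[a][b] * c_given_b[b][c] for b in range(n_b))
                     for c in range(n_c)]
        lhs = l1(c_given_a, p_c)
        rhs = l1(b_given_a[a], p_b) * max(l1(c_given_b[b], p_c) for b in range(n_b))
        pairs.append((lhs, rhs))
    return pairs

def holds_on_random_chains(trials=200, seed=0):
    """True if lhs <= rhs on every sampled chain (up to rounding)."""
    rng = random.Random(seed)
    return all(lhs <= rhs + 1e-12
               for _ in range(trials)
               for lhs, rhs in chain_bounds(3, 4, 3, rng))
```

Calling `holds_on_random_chains()` exercises the bound on random small chains; it is a sanity check, not a proof.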



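A.3 Proof of Lemma 2.3

Proof. We use induction on $k$. When $k = 1$, we have nothing to prove.

Suppose that the lemma holds when $k < t$. We will show that the lemma also holds when $k = t$. In fact,

\[
\begin{aligned}
\Pr[\mathbf{x} = x \mid \{\mathbf{y}_i\}_{i=1}^{t}]
&= \frac{\Pr[\mathbf{x} = x \mid \mathbf{y}_t] \Pr[\{\mathbf{y}_i\}_{i=1}^{t-1} \mid \mathbf{x} = x, \mathbf{y}_t]}{\sum_{x' \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))} \Pr[\mathbf{x} = x' \mid \mathbf{y}_t] \Pr[\{\mathbf{y}_i\}_{i=1}^{t-1} \mid \mathbf{x} = x', \mathbf{y}_t]} \\
&= \frac{\Pr[\mathbf{x} = x \mid \mathbf{y}_t] \Pr[\{\mathbf{y}_i\}_{i=1}^{t-1} \mid \mathbf{x} = x]}{\sum_{x' \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))} \Pr[\mathbf{x} = x' \mid \mathbf{y}_t] \Pr[\{\mathbf{y}_i\}_{i=1}^{t-1} \mid \mathbf{x} = x']} \qquad (\text{from } \mathbf{y}_t \perp \mathbf{y}_i \mid \mathbf{x}) \\
&= \frac{\Pr[\mathbf{x} = x \mid \mathbf{y}_t] \Pr[\mathbf{x} = x \mid \{\mathbf{y}_i\}_{i=1}^{t-1}] \Pr[\{\mathbf{y}_i\}_{i=1}^{t-1}] / \Pr[\mathbf{x} = x]}{\sum_{x' \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))} \Pr[\mathbf{x} = x' \mid \mathbf{y}_t] \Pr[\mathbf{x} = x' \mid \{\mathbf{y}_i\}_{i=1}^{t-1}] \Pr[\{\mathbf{y}_i\}_{i=1}^{t-1}] / \Pr[\mathbf{x} = x']} \\
&= \frac{\Pr[\mathbf{x} = x \mid \mathbf{y}_t] \Pr[\mathbf{x} = x \mid \{\mathbf{y}_i\}_{i=1}^{t-1}]}{\sum_{x' \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))} \Pr[\mathbf{x} = x' \mid \mathbf{y}_t] \Pr[\mathbf{x} = x' \mid \{\mathbf{y}_i\}_{i=1}^{t-1}]} \qquad (\Pr[\mathbf{x} = x] \text{ is equal for every } x).
\end{aligned}
\]

Substituting

\[ \Pr[\mathbf{x} = x \mid \{\mathbf{y}_i\}_{i=1}^{t-1}] = \frac{\prod_{i=1}^{t-1} \Pr[\mathbf{x} = x \mid \mathbf{y}_i]}{\sum_{x' \in \mathrm{Supp}(\mathcal{D}(\mathbf{x}))} \prod_{i=1}^{t-1} \Pr[\mathbf{x} = x' \mid \mathbf{y}_i]}, \]

we get the desired result. □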
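As an illustration of the product rule that this induction establishes, the sketch below compares the posterior computed from the full joint distribution with the normalized product of single-observation posteriors. It assumes a uniform prior on $\mathbf{x}$ and observations $\mathbf{y}_1, \ldots, \mathbf{y}_k$ that are conditionally independent given $\mathbf{x}$. The likelihood tables and function names are hypothetical and introduced only for this example.

```python
from itertools import product
from math import prod

def normalize(w):
    """Scale a list of non-negative weights so that it sums to one."""
    total = sum(w)
    return [v / total for v in w]

def true_posterior(prior, likelihoods, ys):
    """Pr[x | y_1..y_k] by Bayes' rule; likelihoods[i][x][y] = Pr[y_i = y | x]."""
    return normalize([prior[x] * prod(L[x][y] for L, y in zip(likelihoods, ys))
                      for x in range(len(prior))])

def product_formula(prior, likelihoods, ys):
    """Normalized product of the single-observation posteriors Pr[x | y_i]."""
    singles = [true_posterior(prior, [L], [y]) for L, y in zip(likelihoods, ys)]
    return normalize([prod(s[x] for s in singles) for x in range(len(prior))])

def max_gap(prior, likelihoods, num_outcomes):
    """Largest coordinate-wise gap between the two posteriors over all observations."""
    gap = 0.0
    for ys in product(range(num_outcomes), repeat=len(likelihoods)):
        a = true_posterior(prior, likelihoods, ys)
        b = product_formula(prior, likelihoods, ys)
        gap = max(gap, max(abs(u - v) for u, v in zip(a, b)))
    return gap
```

With a uniform prior the gap is zero up to rounding. With a skewed prior the product formula over-counts the prior, which is why the proof uses the step "$\Pr[\mathbf{x} = x]$ is equal for every $x$".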

B Proofs for Section 3

B.1 Proof of Lemma 3.1

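Proof. Fix a set $S$ of $s$ (random) hyperedges and a set $X$ of $cs$ vertices, where $c = k - 1 - \eta$. We consider the probability that every hyperedge of $S$ is contained in $X$. Since $cs$ vertices are involved with at most $csd$ hyperedges and $s$ hyperedges determine at most $ks$ neighbors, this probability is upper-bounded by

\[ \binom{csd}{ks} \Big/ \binom{nd}{ks} \le \frac{(csd)^{ks}}{(ks)!} \Big/ \frac{(nd - ks)^{ks}}{(ks)!} \le \left( \frac{csd}{nd - ks} \right)^{ks} \le \left( \frac{csd}{(d - k\gamma) n} \right)^{ks}. \]

For a fixed $s$, $X$ can be chosen in $\binom{n}{cs}$ ways and $S$ can be chosen in $\binom{dn}{s}$ ways. Thus, the probability that such an event occurs is upper-bounded by

\[
\begin{aligned}
\binom{n}{cs} \binom{dn}{s} \left( \frac{csd}{(d - k\gamma) n} \right)^{ks}
&\le \left( \frac{en}{cs} \right)^{cs} \left( \frac{edn}{s} \right)^{s} \left( \frac{csd}{(d - k\gamma) n} \right)^{ks} \\
&\le \left[ \left( \frac{s}{n} \right)^{\eta} e^{k - \eta} c^{1 + \eta} d^{k+1} (d - k\gamma)^{-k} \right]^{s} \\
&\le \left[ \left( \frac{s}{n} \right)^{\eta} \beta \right]^{s}
\end{aligned}
\]

for some $\beta$.

By summing over $1 \le s \le \gamma n$,

\[ \sum_{s=1}^{\gamma n} \left[ \left( \frac{s}{n} \right)^{\eta} \beta \right]^{s} \le \sum_{s=1}^{\log n} \left[ \left( \frac{s}{n} \right)^{\eta} \beta \right]^{s} + \sum_{s=\log n + 1}^{\gamma n} \left[ \left( \frac{s}{n} \right)^{\eta} \beta \right]^{s} \le O\!\left( \left( \frac{\log n}{n} \right)^{\eta} \right) + O\!\left( (\gamma^{\eta} \beta)^{\log n} \right). \]

The first term is $o(1)$ and the second term is also $o(1)$ by taking $\gamma$ small enough. □

B.2 Proof of Lemma 3.5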

Proof. Let us fix an assignment $x \in \{0,1\}^n$ over the variables and let $X_e$ be a random variable indicating that the constraint $C_e$ is satisfied by the assignment. Then, $\mathrm{E}[X_e] = |P^{-1}(1)|/2^k$ and all $X_e$ are mutually independent since the $\mathbf{b}_e$ are mutually independent. Let $X = \sum_e X_e$. Then, from Hoeffding's inequality, $\Pr[|X - \mathrm{E}[X]| \ge \varepsilon \mathrm{E}[X]] \le \exp(-\Omega(\varepsilon^2 dn))$. By choosing $d = \Omega(1/\varepsilon^2)$, the union bound over all $2^n$ possible assignments yields the desired result. □
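The identity $\mathrm{E}[X_e] = |P^{-1}(1)|/2^k$ can be verified by direct enumeration. The sketch below assumes that each literal vector $\mathbf{b}_e$ is uniform over $\{0,1\}^k$ in the far distribution (its definition lies outside this appendix) and uses the 3-ary not-all-equal predicate purely as an example. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def nae(z):
    """Not-all-equal predicate: 1 iff the bits of z are not all the same."""
    return int(len(set(z)) > 1)

def satisfied_fraction(P, x):
    """Exact Pr over uniform b in {0,1}^k that P(x + b) = 1, for a fixed assignment x."""
    k = len(x)
    hits = sum(P(tuple(xi ^ bi for xi, bi in zip(x, b)))
               for b in product((0, 1), repeat=k))
    return Fraction(hits, 2 ** k)
```

For every fixed $x$, the map $b \mapsto x + b$ is a bijection of $\{0,1\}^k$, so the fraction does not depend on $x$; for the example predicate it equals $6/8$.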

B.3 Proof of Lemma 3.7

Proof. After the $t$-th interaction, the number of vertices in the query-answer history is at most $kt$. Thus, the sum of the remaining degrees of those vertices is at most $dkt$. On the other hand, the sum of the remaining degrees of the other vertices is at least $dn - dkt$. Thus, the probability that the $i$-th vertex ($1 \le i \le k$) of the edge for the $t$-th answer is contained in the query-answer history is at most $dkt/(dn - dkt) \le 2dkt/dn$ when $t \le n/2k$. Therefore, the expected value of $\mathrm{cy}(G)$ is at most

\[ \sum_{t=1}^{n^{1/2+\delta}} k \cdot \frac{2kt}{n} \le O(k^2 n^{2\delta}). \]

From Markov's inequality, the lemma follows. □

B.4 Proof of Lemma 3.8

Proof. Let $q_t = (v_t, i_t)$ be the $t$-th query by $\mathcal{A}$. After the $t$-th interaction, the number of vertices in the query-answer history is at most $kt$. Since the degree is bounded by $d$, the number of vertices in the query-answer history whose distance from $v_t$ is at most $g$ is at most $(dk)^g$. Thus, the sum of the remaining degrees of such vertices is at most $d(dk)^g$. On the other hand, the sum of the remaining degrees of the other vertices is at least $dn - d(dk)^g$. Thus, the probability that the $i$-th vertex ($1 \le i \le k$) of the edge for the $t$-th answer is contained in the query-answer history is at most $d(dk)^g/(dn - d(dk)^g) \le 2d(dk)^g/dn$. The last inequality follows from $g \le (\log_{dk} n)/2$. Therefore, by the union bound, the probability that such an event occurs is at most

\[ k n^{1/2+\delta} \cdot \frac{2d(dk)^g}{dn} = O(2k n^{-\delta}) = o(1). \]
□

B.5 Proof of Lemma 3.9

First, for an edge $e$ and a vertex $v \in e$, we show that $\mathbf{x}_v$ and $\mathbf{b}_e$ are independent.

Lemma B.1. Let $H$ be a hypergraph and let $e$ be an edge of $H$. Then, for any vertex $v \in e$, $\mathbf{x}_v$ and $\mathbf{b}_e$ are independent under $\mathcal{D}_H$.

Proof. We show that $\mathbf{b}_e$ is uniform after we choose the value of $\mathbf{x}_v$. Let $e = (v_1, \ldots, v_k)$ and assume that $\mathbf{x}_{v_1} = 0$ without loss of generality. Then, $\mathcal{D}_H$ generates $\mathbf{x}_{v_2}, \ldots, \mathbf{x}_{v_k}$ uniformly at random. Let $x \in \{0,1\}^{k-1}$ be the vector of chosen values. Then, $\mathcal{D}_H$ chooses $\mathbf{b}_e$ from the set $S_x = \{b \mid P((0, x) + b) = 1\}$. Let $s = |P^{-1}(1)|$ be the size of $S_x$. Here, we separate $P^{-1}(1)$ into $s/2$ couples of vectors $(p, \bar{p})$. Let $(p_1, \bar{p}_1), \ldots, (p_{s/2}, \bar{p}_{s/2})$ be the set of such couples. Then, $S_x$ is partitioned into $S_{x,i} = \{b \mid (0, x) + b = p_i \text{ or } (0, x) + b = \bar{p}_i\}$ ($1 \le i \le s/2$). We consider the set $S_i = \bigcup_{x \in \{0,1\}^{k-1}} S_{x,i}$. Then, it is easy to see that $S_i = \{0,1\}^k$, i.e., every vector from $\{0,1\}^k$ appears exactly once among the sets $S_{x,i}$ ($1 \le i \le s/2$). Thus, eventually, $\mathbf{b}_e$ is distributed uniformly at random in $\{0,1\}^k$. □
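The independence claim for a single edge can be checked exactly for small $k$. The sketch below assumes, as the pairing $(p, \bar{p})$ in the proof requires, that $P^{-1}(1)$ is closed under complement; it uses 3-ary not-all-equal as the example and 2-ary OR as a contrasting predicate without that property. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def nae(z):
    """Not-all-equal: accepted set is closed under bitwise complement."""
    return int(len(set(z)) > 1)

def either(z):
    """OR predicate: accepted set is NOT closed under complement."""
    return int(any(z))

def joint_x1_b(P, k):
    """Exact joint distribution of (x_{v_1}, b_e) for one edge under the planted model."""
    joint = {}
    for x in product((0, 1), repeat=k):          # x is uniform over {0,1}^k
        s_x = [b for b in product((0, 1), repeat=k)
               if P(tuple(xi ^ bi for xi, bi in zip(x, b)))]
        for b in s_x:                            # b is uniform over S_x
            key = (x[0], b)
            joint[key] = joint.get(key, Fraction(0)) + Fraction(1, 2 ** k) / len(s_x)
    return joint

def is_independent_and_uniform(P, k):
    """True iff Pr[x_{v_1} = a, b_e = b] = 2^-(k+1) for every a and b."""
    joint = joint_x1_b(P, k)
    target = Fraction(1, 2 ** (k + 1))
    return all(joint.get((a, b), Fraction(0)) == target
               for a in (0, 1) for b in product((0, 1), repeat=k))
```

For the complement-closed example the joint distribution factorizes exactly; for OR it does not, which shows where the pairing argument is used.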

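Proof of Lemma 3.9. We use induction on the number of edges of $T$. When $T$ consists of one edge, the lemma holds from Lemma B.1.

Let $m \ge 2$ be an integer. Assume that, for any hypertree $T = (V, E)$ with $|E| < m$ and $v \in V$, $\mathbf{x}_v$ and $\mathbf{b}_E$ are independent under $\mathcal{D}_H$. Let $T = (V, E)$ be a hypertree with $|E| = m$ and let $v$ be the supposed vertex. Since $m \ge 2$, there exists an edge $e$ such that $e$ contains a leaf, but does not contain $v$ as a leaf. Let $w$ be the unique vertex that connects $T - e$ and $e$. Note that $w$ may coincide with $v$. Then,

\[
\begin{aligned}
\Pr[\mathbf{b}_e \mid \mathbf{b}_{E-e}]
&= \sum_{x_w} \Pr[\mathbf{b}_e \mid x_w] \Pr[x_w \mid \mathbf{b}_{E-e}] \qquad (\text{from } \mathbf{b}_e \perp \mathbf{b}_{E-e} \mid \mathbf{x}_w) \\
&= \sum_{x_w} \Pr[\mathbf{b}_e] \Pr[x_w \mid \mathbf{b}_{E-e}] \qquad (\text{from Lemma B.1}) \\
&= \Pr[\mathbf{b}_e].
\end{aligned}
\]

Thus,

\[ \Pr[\mathbf{x}_v \mid \mathbf{b}_E] = \sum_{x_w} \frac{\Pr[\mathbf{x}_v, x_w, \mathbf{b}_E]}{\Pr[\mathbf{b}_E]} = \sum_{x_w} \frac{\Pr[\mathbf{x}_v, \mathbf{b}_{E-e} \mid x_w, \mathbf{b}_e] \Pr[\mathbf{b}_e \mid x_w] \Pr[x_w]}{\Pr[\mathbf{b}_{E-e}] \Pr[\mathbf{b}_e \mid \mathbf{b}_{E-e}]}. \]

Since $\Pr[\mathbf{x}_v, \mathbf{b}_{E-e} \mid x_w, \mathbf{b}_e] = \Pr[\mathbf{x}_v, \mathbf{b}_{E-e} \mid x_w]$, $\Pr[\mathbf{b}_e \mid x_w] = \Pr[\mathbf{b}_e]$ from Lemma B.1, and $\Pr[\mathbf{b}_e \mid \mathbf{b}_{E-e}] = \Pr[\mathbf{b}_e]$, this is equal to

\[ \sum_{x_w} \frac{\Pr[\mathbf{x}_v, \mathbf{b}_{E-e} \mid x_w] \Pr[\mathbf{b}_e] \Pr[x_w]}{\Pr[\mathbf{b}_{E-e}] \Pr[\mathbf{b}_e]} = \sum_{x_w} \frac{\Pr[\mathbf{x}_v, x_w, \mathbf{b}_{E-e}]}{\Pr[\mathbf{b}_{E-e}]} = \Pr[\mathbf{x}_v \mid \mathbf{b}_{E-e}]. \]

From the assumption of the induction, we have $\Pr[\mathbf{x}_v \mid \mathbf{b}_E] = \Pr[\mathbf{x}_v]$. □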

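B.6 Proof of Lemma 3.10

Lemma B.2. Let $\mathcal{D}$ and $\mathcal{D}_i$ ($1 \le i \le k$) be distributions generating $\mathbf{x} \in \{0,1\}$. Suppose that $\Pr_{\mathcal{D}}[\mathbf{x} = x]$ is given by

\[ \Pr_{\mathcal{D}}[\mathbf{x} = x] = \frac{\prod_{i=1}^{k} \Pr_{\mathcal{D}_i}[\mathbf{x} = x]}{\prod_{i=1}^{k} \Pr_{\mathcal{D}_i}[\mathbf{x} = 0] + \prod_{i=1}^{k} \Pr_{\mathcal{D}_i}[\mathbf{x} = 1]}. \]

Then, $d_{\mathrm{TV}}[\mathcal{D}(\mathbf{x})] \le \sum_{i=1}^{k} d_{\mathrm{TV}}[\mathcal{D}_i(\mathbf{x})]$. The equality only holds when (at least) $k - 1$ of $d_{\mathrm{TV}}[\mathcal{D}_i(\mathbf{x})]$ ($1 \le i \le k$) are 0.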

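Proof. We use induction on $k$. When $k = 1$, we have nothing to prove.

Suppose that the lemma holds when $k < t$. Now, we show that the lemma also holds when $k = t$. For notational simplicity, we define $p_x^{j} = \prod_{i=1}^{j} \Pr_{\mathcal{D}_i}[\mathbf{x} = x]$, where the superscript is an index and not a power. Then, $\Pr_{\mathcal{D}}[\mathbf{x} = x] = p_x^{t}/(p_x^{t} + p_{1-x}^{t})$. Let $\mathcal{D}'$ be a distribution generating $\mathbf{x} \in \{0,1\}$ in such a way that

\[ \Pr_{\mathcal{D}'}[\mathbf{x} = x] = \frac{p_x^{t-1}}{p_x^{t-1} + p_{1-x}^{t-1}}. \]

From the assumption, we have $d_{\mathrm{TV}}[\mathcal{D}'(\mathbf{x})] \le \sum_{i=1}^{t-1} d_{\mathrm{TV}}[\mathcal{D}_i(\mathbf{x})]$. Also,

\[
\begin{aligned}
\Pr_{\mathcal{D}}[\mathbf{x} = x]
&= \frac{p_x^{t}}{p_x^{t} + p_{1-x}^{t}} \\
&= \frac{p_x^{t-1} \Pr_{\mathcal{D}_t}[\mathbf{x} = x]}{p_x^{t-1} \Pr_{\mathcal{D}_t}[\mathbf{x} = x] + p_{1-x}^{t-1} \Pr_{\mathcal{D}_t}[\mathbf{x} = 1 - x]} \\
&= \frac{\Pr_{\mathcal{D}'}[\mathbf{x} = x] \Pr_{\mathcal{D}_t}[\mathbf{x} = x]}{\Pr_{\mathcal{D}'}[\mathbf{x} = x] \Pr_{\mathcal{D}_t}[\mathbf{x} = x] + \Pr_{\mathcal{D}'}[\mathbf{x} = 1 - x] \Pr_{\mathcal{D}_t}[\mathbf{x} = 1 - x]}.
\end{aligned}
\]

Let $\delta' = \Pr_{\mathcal{D}'}[\mathbf{x} = 0] - 1/2$ and $\delta_t = \Pr_{\mathcal{D}_t}[\mathbf{x} = 0] - 1/2$, where $-1/2 \le \delta', \delta_t \le 1/2$. Then,

\[ \left| \Pr_{\mathcal{D}}[\mathbf{x} = 0] - \frac{1}{2} \right| = \left| \frac{(\frac{1}{2} + \delta')(\frac{1}{2} + \delta_t)}{(\frac{1}{2} + \delta')(\frac{1}{2} + \delta_t) + (\frac{1}{2} - \delta')(\frac{1}{2} - \delta_t)} - \frac{1}{2} \right| = \left| \frac{\delta' + \delta_t}{1 + 4\delta'\delta_t} \right| \le |\delta'| + |\delta_t|. \qquad (1) \]

Similarly, $|\Pr_{\mathcal{D}}[\mathbf{x} = 1] - 1/2| \le |\delta'| + |\delta_t|$. Thus, $d_{\mathrm{TV}}[\mathcal{D}(\mathbf{x})] \le 2(|\delta'| + |\delta_t|) \le d_{\mathrm{TV}}[\mathcal{D}'(\mathbf{x})] + d_{\mathrm{TV}}[\mathcal{D}_t(\mathbf{x})] \le \sum_{i=1}^{t} d_{\mathrm{TV}}[\mathcal{D}_i(\mathbf{x})]$.

We finally remark on the condition under which the equality holds. We note that the case $\delta' = 1/2$ and $\delta_t = -1/2$, or vice versa, cannot happen since in this case we cannot decide the value of $\mathbf{x}$. Thus, the equality in (1) only holds when $\delta'$ or $\delta_t$ equals zero. Also, $d_{\mathrm{TV}}[\mathcal{D}(\mathbf{x})]$ becomes non-zero if $d_{\mathrm{TV}}[\mathcal{D}'(\mathbf{x})]$ or $d_{\mathrm{TV}}[\mathcal{D}_t(\mathbf{x})]$ is non-zero. Thus, the claim holds. □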
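A small numerical sketch of Lemma B.2 follows. It assumes the convention used in the proof, namely that $d_{\mathrm{TV}}$ of a bit is its $L_1$ distance from the uniform bit, $|q - 1/2| + |(1 - q) - 1/2|$ with $q = \Pr[\mathbf{x} = 0]$. The helper names are hypothetical.

```python
import random
from math import prod

def combine(qs):
    """Given qs[i] = Pr_{D_i}[x = 0], return Pr_D[x = 0] under the product rule."""
    p0 = prod(qs)
    p1 = prod(1.0 - q for q in qs)
    return p0 / (p0 + p1)

def tv_from_uniform(q):
    """L1 distance between a bit with Pr[0] = q and the uniform bit."""
    return abs(q - 0.5) + abs((1.0 - q) - 0.5)

def bound_holds(trials=1000, seed=0):
    """Check d_TV[D] <= sum_i d_TV[D_i] on random inputs with 1 to 6 factors."""
    rng = random.Random(seed)
    for _ in range(trials):
        qs = [rng.uniform(0.01, 0.99) for _ in range(rng.randint(1, 6))]
        if tv_from_uniform(combine(qs)) > sum(tv_from_uniform(q) for q in qs) + 1e-12:
            return False
    return True
```

When all but one factor is uniform, for instance `combine([0.5, 0.5, 0.8])`, the combined bit has the same bias as the single non-uniform factor, which is the equality case described in the lemma.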



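Proof of Lemma 3.10. We note that $\mathbf{x}_{L_i} \perp \mathbf{x}_{L_j} \mid \mathbf{x}_v$ for $1 \le i, j \le \ell$ under $\mathcal{D}_H(\cdot \mid \mathbf{b}_E)$. Since $\mathcal{D}_H(\mathbf{x}_v \mid \mathbf{b}_E)$ is uniform from Lemma 3.9, we have from Lemma 2.3 that

\[ \Pr[\mathbf{x}_v = x \mid \{\mathbf{x}_{L_i}\}_{i=1}^{\ell}, \mathbf{b}_E] = \frac{\prod_{i=1}^{\ell} \Pr[\mathbf{x}_v = x \mid \mathbf{x}_{L_i}, \mathbf{b}_E]}{\sum_{x'} \prod_{i=1}^{\ell} \Pr[\mathbf{x}_v = x' \mid \mathbf{x}_{L_i}, \mathbf{b}_E]}. \]

Then, by applying Lemma B.2, we have the desired result. □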



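B.7 Proof of Lemma 3.11

Proof. Let $p_x = \Pr[\mathbf{x}_e = x \mid \{\mathbf{x}_{L_i}\}_{i=1}^{k-1}, \mathbf{b}_E]$ for $x \in \{0,1\}^k$. Also, we let $\delta_i = \Pr[\mathbf{x}_{v_i} = 0 \mid \mathbf{x}_{L_i}, \mathbf{b}_E] - 1/2$, where $-1/2 \le \delta_i \le 1/2$, and $S = \mathrm{Supp}(\mathcal{D}_H(\mathbf{x}_e \mid \mathbf{b}_e)) \subseteq \{0,1\}^k$. Then, from Lemma 2.3,

\[ p_x = \frac{\prod_{i=1}^{k-1} (1/2 + (-1)^{x_i} \delta_i)}{\sum_{x' \in S} \prod_{i=1}^{k-1} (1/2 + (-1)^{x'_i} \delta_i)}. \]

Let $S_v = \{x \in S \mid x_k = v\}$. Then

\[ d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_k} \mid \{\mathbf{x}_{L_i}\}_{i=1}^{k-1}, \mathbf{b}_E)] = \Bigl| \sum_{x \in S_1} p_x - \sum_{x \in S_0} p_x \Bigr| \le \sum_{x \in S_1} |p_x - p_{\bar{x}}| = \sum_{x \in S_1} (p_x + p_{\bar{x}}) \frac{|p_x - p_{\bar{x}}|}{p_x + p_{\bar{x}}}. \]

For each $x \in S_1$, we can think of a random (Boolean) variable that takes 1 with probability $p_x/(p_x + p_{\bar{x}})$ and 0 with probability $p_{\bar{x}}/(p_x + p_{\bar{x}})$. Then, $|p_x - p_{\bar{x}}|/(p_x + p_{\bar{x}})$ can be seen as the total variation distance of this random variable. Since the probability distribution of this random variable exactly matches the condition of Lemma B.2 (note that the denominator of $p_x$ is canceled out), we have

\[
\begin{aligned}
d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_k} \mid \{\mathbf{x}_{L_i}\}_{i=1}^{k-1}, \mathbf{b}_E)]
&\le \sum_{x \in S_1} (p_x + p_{\bar{x}}) \sum_{i=1}^{k-1} d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_i} \mid \mathbf{x}_{L_i}, \mathbf{b}_E)] \qquad (2) \\
&= \sum_{i=1}^{k-1} d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_i} \mid \mathbf{x}_{L_i}, \mathbf{b}_E)].
\end{aligned}
\]

This already indicates that $\rho(P) \le 1$.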

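We fix $P \ne \mathrm{EQU}$ and let $\delta = (\delta_1, \ldots, \delta_{k-1})$. Let $D_\varepsilon = \{\delta \in [-1/2, 1/2]^{k-1} \mid \sum_{i=1}^{k-1} 2|\delta_i| \le 1 + \varepsilon\}$ for $\varepsilon > 0$ chosen later. This excludes singular points implied by the left-hand side of (1). We define

\[ \rho(\delta) = \frac{d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_k} \mid \{\mathbf{x}_{L_i}\}_{i=1}^{k-1}, \mathbf{b}_E)]}{\sum_{i=1}^{k-1} d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_i} \mid \mathbf{x}_{L_i}, \mathbf{b}_E)]} = \frac{\bigl| \sum_{x \in S_1} p_x - \sum_{x \in S_0} p_x \bigr|}{\sum_{i=1}^{k-1} 2|\delta_i|}. \]

We can safely state that $\rho(\delta) \le 1/(1 + \varepsilon)$ in $[-1/2, 1/2]^{k-1} - D_\varepsilon$.

Next, we only consider the domain $D_\varepsilon^{+} = D_\varepsilon \cap [0, 1/2]^{k-1}$. Other domains (i.e., $D_\varepsilon - [0, 1/2]^{k-1}$) can be treated in the same manner. After a calculation, we can see that the limit at $\delta = (0, \ldots, 0)$ exists and the value is less than one when $P \ne \mathrm{EQU}$. Thus, $\rho(\delta)$ is continuous in $D_\varepsilon^{+}$. In particular, $\rho(\delta)$ is uniformly continuous.

We will show that there exists a universal constant $\rho < 1$ such that $\rho(\delta) \le \rho$ regardless of $\delta \in D_\varepsilon^{+}$. This concludes the lemma. For $\varepsilon > 0$, we define $H_\varepsilon = D_\varepsilon^{+} \cap \{\delta \in [0, 1/2]^{k-1} \mid \text{all but one of } \delta_i \le \varepsilon\}$. When $\delta \notin H_\varepsilon$, considering the inequality (2) in the proof of Lemma B.2, there exists some constant $\rho < 1$ such that $\rho(\delta) \le \rho$.

The remaining case is $\delta \in H_\varepsilon$. We will show that there exists a constant $\rho < 1$ such that $\rho(\delta) \le \rho$ for $\delta \in H_0$. From the uniform continuity of $\rho(\delta)$, by choosing $\varepsilon > 0$ small enough, we establish the desired result.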

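Let $\delta \in H_0$. Without loss of generality, we can assume that $\delta_1 > 0$ and $\delta_2 = \cdots = \delta_{k-1} = 0$. For $\rho > 0$, we consider the following function of $\rho$ and $\delta$.

\[
\begin{aligned}
f(\rho, \delta)
&= \rho \sum_{i=1}^{k-1} d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_i} \mid \mathbf{x}_{L_i}, \mathbf{b}_E)] - d_{\mathrm{TV}}[\mathcal{D}_H(\mathbf{x}_{v_k} \mid \{\mathbf{x}_{L_i}\}_{i=1}^{k-1}, \mathbf{b}_E)] \\
&= \rho \sum_{i=1}^{k-1} 2\delta_i - \Bigl| \sum_{x \in S_1} p_x - \sum_{x \in S_0} p_x \Bigr| \\
&= \min\Bigl( \rho \sum_{i=1}^{k-1} 2\delta_i - \Bigl( \sum_{x \in S_1} p_x - \sum_{x \in S_0} p_x \Bigr),\; \rho \sum_{i=1}^{k-1} 2\delta_i + \Bigl( \sum_{x \in S_1} p_x - \sum_{x \in S_0} p_x \Bigr) \Bigr) \\
&=: \min(f_1(\rho, \delta), f_2(\rho, \delta)).
\end{aligned}
\]

We let $g(\rho, \delta_1) = f(\rho, \delta_1, 0, \ldots, 0)$. If the minimum of $g(\rho, \delta_1)$ over $0 \le \delta_1 \le 1/2$ is non-negative, we can say that $\rho(\delta) \le \rho$ for $\delta \in H_0$. Since the denominator of $g$ (after factoring) is non-negative, we only consider its numerator $\tilde{g} =: \min\{\tilde{g}_1, \tilde{g}_2\}$. We note that $\tilde{g}_1, \tilde{g}_2$ are odd functions and the degrees of $\tilde{g}_1, \tilde{g}_2$ are at most two. Thus, each of $\tilde{g}_1, \tilde{g}_2$ is a linear function of $\delta_1$ when we fix $\rho$. Suppose that $\tilde{g}(\rho, \delta_1) < 0$ for some $\delta_1 \in [0, 1/2]$. Then, by moving $\delta_1$ to 0 or 1/2, we obtain a smaller value. Thus, it suffices to check the cases $\delta_1 = 0$ and $\delta_1 = 1/2$. When $\delta_1 = 0$, we have already seen that $\rho(0, \ldots, 0) < 1$. The case $\delta_1 = 1/2$ corresponds to the following question: how well can we guess the value of $\mathbf{x}_{v_k}$ when we know the actual value of $\mathbf{x}_{v_1}$ and do not know the values of the other variables? For any symmetric predicate except EQU, the choice of $\mathbf{x}_{v_k}$ is not unique. Thus, we can choose $\rho < 1$ that only depends on $P$. □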

C An $\Omega(n^{1/2+\delta})$ Lower Bound for Testing 2-XOR

In this section, we give the proof of Theorem 1.2. To make hard instances that are $\varepsilon$-close to satisfiability, we slightly modify the construction of $\mathcal{D}_{\mathrm{sat}}$. We use the same $\mathcal{D}_{\mathrm{far}}$ as defined in Section 3.2.

Definition C.1. Let $H = (V, E)$ be a graph with $n$ vertices. Let $\varepsilon$ be an error parameter. Define a distribution $\mathcal{D}_{H,\varepsilon}$ generating an instance $\Phi$ of 2-XOR as follows. The variable set of $\Phi$ is $\{x_v\}_{v \in V}$. We choose $\mathbf{x} \in \{0,1\}^n$ uniformly at random. For each edge $e = (u, v) \in E$, we choose $\mathbf{b}_e$ uniformly at random from the set $\{b \in \{0,1\}^2 \mid P((\mathbf{x}_u, \mathbf{x}_v) + b) = 1\}$ with probability $1 - \varepsilon$, and from the set $\{b \in \{0,1\}^2 \mid P((\mathbf{x}_u, \mathbf{x}_v) + b) = 0\}$ with probability $\varepsilon$. Then, we add a constraint $C_e$ of the form $P((x_u, x_v) + \mathbf{b}_e) = 1$ to $\Phi$.

Definition C.2. Given parameters $n, d, \varepsilon$, define a distribution $\mathcal{D}_{\mathrm{sat}}$ generating an instance of 2-XOR as follows. First, we choose a graph $H$ from $\mathcal{G}_{n,d,2}$. Then, an instance is output according to $\mathcal{D}_{H,\varepsilon}$.

The following lemma is immediate. 

Lemma C.3. For any $\varepsilon > 0$ and $d \ge 1$, the following holds. Let $\Phi$ be an instance of 2-XOR chosen from $\mathcal{D}_{\mathrm{sat}}$. Then, $\Phi$ is $(\varepsilon/2)$-close to satisfiability with probability $1 - o(1)$. □

We use $\mathcal{D}_{\mathrm{sat}}$ defined above instead of $\mathcal{D}_{\mathrm{sat}}$ defined in Section 3 to prove Theorem 1.2. The proof is almost the same as the proof of Theorem 1.1. A modification occurs only in the proof of Lemma 3.11. The following is an analogue of Lemma 3.11 for the distribution $\mathcal{D}_{H,\varepsilon}$.

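Lemma C.4. Let $T = (V, E)$ be a subgraph of a graph $H$. Suppose that $T$ is a tree. Let $T_u$ be the subtree obtained by removing $e = (u, v) \in E$. Here, $u$ is the root of $T_u$. Let $L$ be a subset of the leaves of $T_u$. Then,

\[ d_{\mathrm{TV}}[\mathcal{D}_{H,\varepsilon}(\mathbf{x}_v \mid \mathbf{x}_L, \mathbf{b}_E)] \le (1 - 2\varepsilon)\, d_{\mathrm{TV}}[\mathcal{D}_{H,\varepsilon}(\mathbf{x}_u \mid \mathbf{x}_L, \mathbf{b}_E)]. \]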

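Proof. For simplicity, we assume that $\mathbf{b}_e = 0$. We can prove the other cases in the same manner. It holds that

\[
\begin{aligned}
\Pr[\mathbf{x}_v = 0 \mid \mathbf{x}_L, \mathbf{b}_E]
&= \sum_{x_u} \Pr[\mathbf{x}_v = 0 \mid x_u, \mathbf{b}_E] \Pr[x_u \mid \mathbf{x}_L, \mathbf{b}_E] \\
&= (1 - \varepsilon) \Pr[\mathbf{x}_u = 1 \mid \mathbf{x}_L, \mathbf{b}_E] + \varepsilon \Pr[\mathbf{x}_u = 0 \mid \mathbf{x}_L, \mathbf{b}_E].
\end{aligned}
\]

Let $\delta = \Pr[\mathbf{x}_u = 0 \mid \mathbf{x}_L, \mathbf{b}_E] - 1/2$. Then,

\[ \left| \Pr[\mathbf{x}_v = 0 \mid \mathbf{x}_L, \mathbf{b}_E] - \frac{1}{2} \right| = (1 - 2\varepsilon)|\delta|. \]

Similarly, $|\Pr[\mathbf{x}_v = 1 \mid \mathbf{x}_L, \mathbf{b}_E] - 1/2| \le (1 - 2\varepsilon)|\delta|$. Thus, $d_{\mathrm{TV}}[\mathcal{D}_{H,\varepsilon}(\mathbf{x}_v \mid \mathbf{x}_L, \mathbf{b}_E)] \le 2(1 - 2\varepsilon)|\delta| \le (1 - 2\varepsilon)\, d_{\mathrm{TV}}[\mathcal{D}_{H,\varepsilon}(\mathbf{x}_u \mid \mathbf{x}_L, \mathbf{b}_E)]$. □

Proof of Theorem 1.2. Combining the proof of Theorem 1.1 and Lemma C.4, the theorem holds. Since we use $1 - 2\varepsilon$ instead of $\rho(P)$ as a decaying factor, we need to choose $\delta = O(\varepsilon/\log(k/\varepsilon^2))$. □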
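The decay factor $1 - 2\varepsilon$ can be seen in a two-line computation. The following sketch assumes the case treated in the proof of Lemma C.4 ($\mathbf{b}_e = 0$, noise rate $\varepsilon \le 1/2$) and tracks only $q = \Pr[\mathbf{x} = 0]$ along a path. The function names are hypothetical.

```python
def push_through_noisy_edge(q_u, eps):
    """Pr[x_v = 0] given Pr[x_u = 0] = q_u, for a 2-XOR edge with b_e = 0 and noise eps."""
    return (1.0 - eps) * (1.0 - q_u) + eps * q_u

def tv_from_uniform(q):
    """L1 distance between a bit with Pr[0] = q and the uniform bit."""
    return 2.0 * abs(q - 0.5)

def bias_after_path(q_leaf, eps, length):
    """Distance from uniform after propagating through `length` noisy edges."""
    q = q_leaf
    for _ in range(length):
        q = push_through_noisy_edge(q, eps)
    return tv_from_uniform(q)
```

Starting from a fully known leaf (`q_leaf = 1.0`), `bias_after_path(1.0, eps, g)` equals $(1 - 2\varepsilon)^g$ up to rounding, which is the role played by $\rho(P)^g$ in the symmetric-predicate case.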

D A Linear Lower Bound for One-Sided Error Testers 

Theorem D.1. Let $P : \{0,1\}^k \to \{0,1\}$ be any symmetric predicate except EQU where $k \ge 3$. Then, for any $\varepsilon > 0$, there exists $d \ge 1$ such that any one-sided error $(|P^{-1}(0)|/2^k - \varepsilon)$-tester for CSP(P) with a degree bound $d$ requires $\Omega(n)$ queries.

Proof. Let $\Phi$ be a given instance. Since a one-sided error tester must accept $\Phi$ when $\Phi$ is satisfiable, it cannot reject $\Phi$ unless it has found an unsatisfiable sub-instance of $\Phi$. We show that for any $\varepsilon > 0$ there exists $d$ for which the following holds: there exists an instance $\Phi$ of CSP(P) with a degree bound $d$ such that any linear-size sub-instance is satisfiable while $\Phi$ is $(|P^{-1}(0)|/2^k - \varepsilon)$-far from satisfiability. The theorem clearly follows from this fact.

From Lemma 3.5, for any $\varepsilon > 0$ and $\eta > 0$, there exist $d \ge 1$, $\gamma > 0$, and an instance $\Phi$ with a degree bound $d$ such that $\Phi$ is $(|P^{-1}(0)|/2^k - \varepsilon)$-far from satisfiability and the underlying hypergraph is a $(\gamma, \eta)$-expander. Let $\Phi'$ be a sub-instance of $\Phi$, and let $V(\Phi')$ and $E(\Phi')$ denote the sets of variables and constraints of $\Phi'$, respectively. We show that $\Phi'$ is satisfiable when $|E(\Phi')| \le \gamma n$ by induction on $|E(\Phi')|$.

Clearly, any sub-instance with no constraint is satisfiable. Suppose that any sub-instance of $\Phi$ with fewer than $m$ constraints is satisfiable. Let $\Phi'$ be a sub-instance of $\Phi$ with $m$ constraints. Then, since $H$ is a $(\gamma, \eta)$-expander, $|V(\Phi')| \ge (k - 1 - \eta)|E(\Phi')|$. Since $\eta < 1$, there exists some constraint $C \in E(\Phi')$ such that $C$ shares at most two variables with $E(\Phi') - C$. Suppose that $C$ shares two variables $x_u, x_v$ with $E(\Phi') - C$. Note that $P$ is a symmetric predicate except EQU. If $P$ accepts $x \in \{0,1\}^k$ with $|x| = 1$, then $P$ accepts $x$ with $|x| = k - 1$. If not, there exists some $2 \le w \le k - 2$ such that $P$ accepts $x$ with $|x| = w$. Thus, $P$ accepts $x \in \{0,1\}^k$ with $|x| = w$ for some $2 \le w \le k - 1$. Hence, regardless of the values of $x_u, x_v$, we can satisfy $C$ by appropriately choosing the values of the rest of the variables in $C$. Other cases are similar. Thus, the induction completes and the theorem follows. □

Proof of Theorem 1.3. Let $Q : \{0,1\}^k \to \{0,1\}$ be a predicate such that $P^{-1}(1) \subseteq Q^{-1}(1)$ for a symmetric predicate $P : \{0,1\}^k \to \{0,1\}$ except k-EQU. As in the proof of Theorem 1.1, we change the definition of $\mathcal{D}_H$. That is, for each edge $e = (v_1, \ldots, v_k)$ of $H$, we choose $\mathbf{b}_e$ uniformly at random from the set $\{b \in \{0,1\}^k \mid P((\mathbf{x}_{v_1}, \ldots, \mathbf{x}_{v_k}) + b) = 1\}$ instead of $\{b \in \{0,1\}^k \mid Q((\mathbf{x}_{v_1}, \ldots, \mathbf{x}_{v_k}) + b) = 1\}$. Then, the rest of the proof is the same as the proof of Theorem D.1. □

E An $\varepsilon$-Tester for k-EQU

In this section, we prove Theorem 1.4. The idea is to transform an instance of k-EQU into a graph and use a bipartiteness tester given in [14].

We define a reduction $\varphi$, which maps an instance $\Phi$ of k-EQU to an instance $\Phi'$ of 2-EQU. The set of variables of $\Phi'$ is the same as that of $\Phi$. For each constraint in $\Phi$ of the form $\ell_1 = \ell_2 = \cdots = \ell_k$, where each $\ell_i$ is a literal, we simply introduce $k(k-1)/2$ constraints in $\Phi'$ of the form $\ell_i = \ell_j$ ($1 \le i < j \le k$).
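The reduction $\varphi$ is simple enough to write down directly. The sketch below uses a hypothetical encoding in which a literal is a pair `(variable_index, negated)` and a constraint is a tuple of literals; the function names are likewise illustrative.

```python
from itertools import combinations, product

def reduce_k_equ_to_2_equ(constraints):
    """Replace each constraint l_1 = ... = l_k by the k(k-1)/2 pairwise constraints l_i = l_j."""
    return [(li, lj) for c in constraints for li, lj in combinations(c, 2)]

def satisfies(assignment, constraints):
    """True iff, in every constraint, all literals evaluate to the same bit."""
    def value(literal):
        var, negated = literal
        return assignment[var] ^ negated
    return all(len({value(l) for l in c}) == 1 for c in constraints)

def same_solutions(num_vars, constraints):
    """Check that an instance and its image under the reduction have the same solutions."""
    reduced = reduce_k_equ_to_2_equ(constraints)
    return all(satisfies(a, constraints) == satisfies(a, reduced)
               for a in product((0, 1), repeat=num_vars))
```

Because equality of $k$ literals holds exactly when every pair is equal, the two instances have identical sets of satisfying assignments, which is the "obvious" direction of the lemma that follows.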

Lemma E.1. If $\Phi$ is a satisfiable instance of k-EQU, then $\varphi(\Phi)$ is satisfiable. Conversely, if $\Phi$ is $\varepsilon$-far from satisfiability, then $\varphi(\Phi)$ is $\varepsilon'$-far from satisfiability where $\varepsilon' = 2\varepsilon/k$.

Proof. Let $\Phi' = \varphi(\Phi)$. The former part is obvious. Furthermore, if $\Phi'$ is satisfiable, then $\Phi$ is satisfiable.

We show the latter part. Suppose that $\Phi'$ is not $\varepsilon'$-far from satisfiability. Since the degree bound of $\Phi'$ is (at most) $d' = dk$, we can make $\Phi'$ satisfiable by removing fewer than $\varepsilon' d' n/2 = \varepsilon dn/k$ constraints. Let $\Phi'_{\mathrm{rm}}$ be the resulting instance.

We simulate this removal in $\Phi$. That is, for each removed constraint in $\Phi'$, we remove the corresponding constraint in $\Phi$. Let $\Phi_{\mathrm{rm}}$ be the resulting instance of k-EQU. The number of removed constraints is at most $\varepsilon dn/k$. The important fact is that $\varphi(\Phi_{\mathrm{rm}})$ is a sub-instance of $\Phi'_{\mathrm{rm}}$. Since $\Phi'_{\mathrm{rm}}$ is satisfiable, $\Phi_{\mathrm{rm}}$ is also satisfiable. However, this contradicts the fact that $\Phi$ is $\varepsilon$-far from satisfiability. □

Next, we define a reduction $\varphi_G$, which maps an instance $\Phi$ of 2-EQU to a graph $G$. First, each literal of $\Phi$ forms a vertex in $G$. Next, for each variable $x$ of $\Phi$, we introduce an edge $(x, \bar{x})$ in $G$. We call these edges variable edges. Furthermore, for each constraint in $\Phi$ of the form $\ell_1 = \ell_2$, where $\ell_1$ and $\ell_2$ are literals, we introduce two edges $(\ell_1, \bar{\ell}_2)$ and $(\bar{\ell}_1, \ell_2)$ in $G$. We call these edges constraint edges. The supposed bipartition of $G$ is into the set of literals whose values are 1 (true) and the set of literals whose values are 0 (false) in the solution of $\Phi$.
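The construction of $\varphi_G$ can be sketched together with a brute-force comparison against satisfiability. The encoding is hypothetical: a literal is a pair `(variable_index, negated)` and a 2-EQU constraint is a pair of literals. A constraint such as $x = \bar{x}$ produces a self-loop, which the 2-coloring routine below treats as an odd cycle.

```python
import random
from collections import deque
from itertools import product

def build_graph(num_vars, constraints):
    """Adjacency sets of the literal graph: variable edges plus two edges per constraint."""
    adj = {(v, s): set() for v in range(num_vars) for s in (0, 1)}
    def add(a, b):
        adj[a].add(b)
        adj[b].add(a)
    def neg(lit):
        return (lit[0], 1 - lit[1])
    for v in range(num_vars):
        add((v, 0), (v, 1))                      # variable edge (x, not x)
    for l1, l2 in constraints:
        add(l1, neg(l2))                         # constraint edge (l1, not l2)
        add(neg(l1), l2)                         # constraint edge (not l1, l2)
    return adj

def is_bipartite(adj):
    """Standard BFS 2-coloring; a self-loop counts as an odd cycle."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return False
    return True

def satisfiable(num_vars, constraints):
    """Brute-force satisfiability of a 2-EQU instance."""
    return any(all((a[v1] ^ s1) == (a[v2] ^ s2) for (v1, s1), (v2, s2) in constraints)
               for a in product((0, 1), repeat=num_vars))

def agree_on_random_instances(trials=300, num_vars=4, seed=0):
    """Compare bipartiteness of the graph with satisfiability on random small instances."""
    rng = random.Random(seed)
    for _ in range(trials):
        cons = [((rng.randrange(num_vars), rng.randint(0, 1)),
                 (rng.randrange(num_vars), rng.randint(0, 1)))
                for _ in range(rng.randint(1, 6))]
        if is_bipartite(build_graph(num_vars, cons)) != satisfiable(num_vars, cons):
            return False
    return True
```

Reading off the color classes of a valid 2-coloring gives a satisfying assignment, since each variable edge forces $x$ and $\bar{x}$ to receive different colors.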

Lemma E.2. If $\Phi$ is a satisfiable instance of 2-EQU, then $\varphi_G(\Phi)$ is bipartite. Conversely, if $\Phi$ is $\varepsilon$-far from satisfiability, then $\varphi_G(\Phi)$ is $\varepsilon'$-far from bipartiteness where $\varepsilon' = \varepsilon/(4d)$.

Proof. Let $G = (V, E) = \varphi_G(\Phi)$. The number of vertices of $G$ is $2n$, where $n$ is the number of variables of $\Phi$, and the degree bound $d$ of $G$ is the same as that of $\Phi$. The former part of the lemma is obvious. Furthermore, if $G$ is bipartite, then $\Phi$ is satisfiable.

We show the latter part. Suppose that $G$ is not $\varepsilon'$-far from bipartiteness. Let $E' \subseteq E$ be a set of edges such that $G$ becomes bipartite by removing $E'$ and $|E'| < \varepsilon' d(2n)/2$. First, we canonicalize $E'$ so that $E'$ does not contain variable edges. This is done as follows. If $E'$ contains a variable edge $(x, \bar{x})$, we exclude the edge from $E'$, and instead we add to $E'$ every constraint edge of the form $(x, \ell)$ and $(\bar{x}, \ell)$, where $\ell$ is a literal. This preserves the property that $G$ becomes bipartite by removing $E'$. Since the degree bound of $G$ is $d$, after canonicalizing $E'$, the size of $E'$ is at most $2d \cdot \varepsilon' d(2n)/2 = \varepsilon dn/2$. Let $G_{\mathrm{rm}}$ be the resulting graph after removing $E'$.

We simulate this removal in $\Phi$. That is, for each removed edge in $G$, we remove the corresponding constraint in $\Phi$. This can be done since we excluded variable edges. Let $\Phi_{\mathrm{rm}}$ be the resulting instance of 2-EQU. The number of removed constraints is at most $\varepsilon dn/2$. Again, $\varphi_G(\Phi_{\mathrm{rm}})$ is a subgraph of $G_{\mathrm{rm}}$. Since $G_{\mathrm{rm}}$ is bipartite, $\Phi_{\mathrm{rm}}$ is satisfiable. However, this contradicts the fact that $\Phi$ is $\varepsilon$-far from satisfiability. □

Finally, we use the following algorithm for testing bipartiteness. 

Lemma E.3 ([14]). There exists a one-sided error $\varepsilon$-tester for bipartiteness whose running time is $O(\sqrt{n}\,\mathrm{poly}(\log n/\varepsilon))$, where $n$ is the number of vertices.

Proof of Theorem 1.4. Combining Lemmas E.1, E.2, and E.3, the theorem holds. □



F A Linear Lower Bound for Testing k-CSP

In this section, we show that there exists a certain predicate $P$ such that CSP(P) requires a linear number of queries to distinguish satisfiable instances from instances $(1 - 2k/2^k - \varepsilon)$-far from satisfiability. Then, we show the hardness of MIS. We use a matrix to define the predicate.

Definition F.1. For a matrix $A \in \{0,1\}^{h \times k}$, a predicate $P_A : \{0,1\}^k \to \{0,1\}$ is defined as

\[ P_A(x_1, \ldots, x_k) = 1 \iff A \cdot (x_1, \ldots, x_k)^T = 0. \]

The matrix $A$ is called a generator matrix of $P_A$.

Since $A(x + b) = Ax + Ab$, we posit that a constraint of an instance of CSP($P_A$) is of the form $A \cdot (x_1, \ldots, x_k)^T = (b_1, \ldots, b_h)^T$.

As a hard generator matrix, we use a linear code. A linear code of distance 3 and length $k$ over $\{0,1\}$ is a subspace of $\{0,1\}^k$ such that every non-zero vector in the subspace has at least 3 non-zero entries. We refer to the code below as Hamming code of length $k$.

Fact F.2. Let $2^{r-1} - 1 < k \le 2^r - 1$. Then, there exists a linear code of distance 3 and length $k$ over $\{0,1\}$ with dimension $h = k - r$.

In particular, $|P_A^{-1}(1)| = 2^r \le 2k$ holds for Hamming code $A$.
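For a concrete instance of Definition F.1 and Fact F.2, the sketch below takes $k = 7$ and $r = 3$ and uses one standard systematic generator matrix of the $[7,4,3]$ Hamming code as $A$. This particular matrix and the helper names are choices made for the example and are not taken from the paper.

```python
from itertools import product

# Rows generate a [7,4,3] Hamming code: h = k - r = 4 rows, length k = 7.
A = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]

def p_a(x):
    """P_A(x) = 1 iff A x^T = 0 over GF(2)."""
    return int(all(sum(a * b for a, b in zip(row, x)) % 2 == 0 for row in A))

def accepted_count():
    """|P_A^{-1}(1)|, by enumerating {0,1}^7."""
    return sum(p_a(x) for x in product((0, 1), repeat=7))

def min_distance():
    """Minimum Hamming weight over all non-zero GF(2) combinations of the rows of A."""
    best = 7
    for coeffs in product((0, 1), repeat=len(A)):
        if not any(coeffs):
            continue
        word = [sum(c * row[j] for c, row in zip(coeffs, A)) % 2 for j in range(7)]
        best = min(best, sum(word))
    return best
```

Here the accepted set is the kernel of a rank-4 matrix over GF(2), so it has $2^{7-4} = 2^3$ elements, in line with $|P_A^{-1}(1)| = 2^r \le 2k$.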

We define two distributions $\mathcal{D}_{\mathrm{sat}}$ and $\mathcal{D}_{\mathrm{far}}$ of instances of CSP($P_A$) using Hamming code $A$. From Lemma 3.1, for any $\eta > 0$ and $d \ge 1$, there exists $\gamma > 0$ such that we have a $k$-uniform $(\gamma, \eta)$-expander $H = (V, E)$ with $n$ vertices and a degree bound $d$. We use $H$ as an underlying hypergraph of instances generated by $\mathcal{D}_{\mathrm{sat}}$ and $\mathcal{D}_{\mathrm{far}}$.

• $\mathcal{D}_{\mathrm{sat}}$: We choose $\mathbf{x}_v \in \{0,1\}$ for each $v \in V$ uniformly at random. Then, for each edge $e = (v_1, \ldots, v_k) \in E$, we introduce a constraint of the form $A \cdot (x_{v_1}, \ldots, x_{v_k})^T = A \cdot (\mathbf{x}_{v_1}, \ldots, \mathbf{x}_{v_k})^T$.

• $\mathcal{D}_{\mathrm{far}}$: For each edge $e = (v_1, \ldots, v_k) \in E$, we choose $\mathbf{b}_e \in \{0,1\}^h$ uniformly at random and introduce a constraint of the form $A \cdot (x_{v_1}, \ldots, x_{v_k})^T = \mathbf{b}_e$.

From the construction, any instance of $\mathcal{D}_{\mathrm{sat}}$ is satisfiable. On the other hand, from Lemma 3.5, for any $\varepsilon > 0$, by appropriately choosing $d$, a $1 - o(1)$ fraction of instances of $\mathcal{D}_{\mathrm{far}}$ is $(|P_A^{-1}(0)|/2^k - \varepsilon)$-far, i.e., $(1 - 2k/2^k - \varepsilon)$-far.

Theorem F.3. Let $A$ be Hamming code. Then, for any $\varepsilon > 0$, there exists $d \ge 1$ such that every $(1 - 2k/2^k - \varepsilon)$-tester for CSP($P_A$) with a degree bound $d$ requires $\Omega(n)$ queries.

Proof. Let $\Phi$ be an instance chosen from $\mathcal{D}_{\mathrm{sat}}$. One constraint of CSP($P_A$) consists of a chunk of $h$ linear equations. Thus, in total, there exist $mh$ linear equations in $\Phi$, where $m$ is the number of constraints of $\Phi$. We can write these equations in the form $Mx = b$ using a matrix $M \in \{0,1\}^{mh \times n}$ and a vector $b \in \{0,1\}^{mh}$. Here, $M$ is uniquely determined by the underlying hypergraph $H$ regardless of $b$.

We show that for any set of $s \le \gamma n$ constraints of $\Phi$, the corresponding rows in $M$ are linearly independent. Suppose that there exists a set $R$ of $s$ constraints whose corresponding rows are linearly dependent. Let $S$ denote the set of variables incident to $R$. Since every chunk of equations comes from a distance-3 code, every linear combination of rows within a chunk must have at least three elements. Hence, the linear combination required to derive 0 must include at least three elements from each of the $s$ constraints. To derive 0, each of these elements must occur an even number of times, and hence $s$ constraints can involve at most $ks - 3s/2 = (k - 1 - 1/2)s$ variables in total. If we choose $\eta < 1/2$, this is impossible.

Let $M'x = b'$ be a sub-instance obtained by choosing any $\gamma n$ constraints. Since the rows of $M'$ are linearly independent, $M'\mathbf{x}$ is also uniformly distributed when $\mathbf{x} \in \{0,1\}^n$ is chosen uniformly at random. Thus, no algorithm can distinguish instances of $\mathcal{D}_{\mathrm{sat}}$ from instances of $\mathcal{D}_{\mathrm{far}}$ with $\gamma n$ queries. The theorem follows. □

Proof of Theorem 1.6. Using a modified version of the FGLSS reduction from Max k-CSP to MIS used in [25], which is tailored for bounded-degree instances, we have this theorem. □